Comparison of Methods of Estimation for Use in Goodness of Fit Tests for Binary Multilevel Models

Data arising in practice frequently have a hierarchical or nested structure. Multilevel modelling is a modern approach to handling this kind of data. When multilevel modelling is combined with a binary response, estimation becomes complex, and the usual techniques are derived from the quasi-likelihood method. The estimation methods compared in this study are marginal quasi-likelihood of order 1 and order 2 (MQL1, MQL2) and penalized quasi-likelihood of order 1 and order 2 (PQL1, PQL2). A statistical model is of no use if it does not reflect the given dataset. Therefore, checking the adequacy of the fitted model through a goodness-of-fit (GOF) test is an essential stage in any modelling procedure. However, before the test is used, it is equally important to confirm that it performs well and is suitable for the given model. This study assesses the suitability of the GOF test developed for binary response multilevel models with respect to the method used in model estimation. An extensive set of simulations was conducted using MLwiN (v 2.19) with varying numbers of clusters, cluster sizes and intra-cluster correlations. The test maintained the desired Type-I error for models estimated using PQL2, and it failed for almost all the combinations under MQL. The power of the test was adequate for most of the combinations under all estimation methods except MQL1. Moreover, models were fitted to a real-life dataset using the four methods, and the performance of the test was compared across the models.
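The simulation design described above (varying the number of clusters, the cluster size and the intra-cluster correlation) can be illustrated with a minimal Python sketch. It assumes a random-intercept logistic model and the latent-variable definition ICC = sigma_u^2 / (sigma_u^2 + pi^2/3); neither is confirmed by the abstract, and the function names, the single covariate and the coefficient values are hypothetical. The study itself ran its simulations in MLwiN.

```python
import math
import random


def latent_icc(sigma_u2):
    """ICC on the latent logistic scale (an assumed definition)."""
    return sigma_u2 / (sigma_u2 + math.pi ** 2 / 3)


def sigma_u2_for_icc(icc):
    """Random-intercept variance that yields the requested latent ICC."""
    return icc * (math.pi ** 2 / 3) / (1.0 - icc)


def simulate_clusters(n_clusters, cluster_size, icc,
                      beta0=0.0, beta1=1.0, seed=0):
    """Return rows (cluster_id, x, y) from a random-intercept logistic model.

    beta0, beta1 and the standard normal covariate x are illustrative
    choices, not values taken from the study.
    """
    rng = random.Random(seed)
    sd_u = math.sqrt(sigma_u2_for_icc(icc))
    rows = []
    for j in range(n_clusters):
        u_j = rng.gauss(0.0, sd_u)          # cluster-level random intercept
        for _ in range(cluster_size):
            x = rng.gauss(0.0, 1.0)         # unit-level covariate
            p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x + u_j)))
            y = 1 if rng.random() < p else 0
            rows.append((j, x, y))
    return rows


# One cell of a simulation grid: 10 clusters of size 5, ICC = 0.2.
data = simulate_clusters(n_clusters=10, cluster_size=5, icc=0.2)
```

Repeating such a call over a grid of cluster counts, cluster sizes and ICC values, fitting the model to each replicate and recording how often the GOF test rejects, gives the empirical Type-I error (when the fitted model is correct) or the power (when it is not).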

A Brief Study about Nonparametric Adherence Tests

Statistics has become indispensable in many fields of knowledge. Geotechnics is no different: probabilistic and statistical methods have gained importance because of their use in characterizing the uncertainties inherent in soil properties. One situation that engineers constantly face is choosing a probability distribution that adequately represents the sampled data. Goodness-of-fit tests are needed to discard poorly fitting distributions. In this paper, three non-parametric goodness-of-fit tests are applied to a computationally generated data set to test its fit to a series of known distributions. It is shown that the normal distribution does not always provide satisfactory results regarding the physical and behavioral representation of the modeled parameters.

Effect of Load Ratio on Probability Distribution of Fatigue Crack Propagation Life in Magnesium Alloys

Predicting the fatigue crack propagation life is necessary for assessing structural integrity. Because structural behavior is uncertain and random, it is also necessary to analyze the stochastic characteristics of the fatigue crack propagation life at a specified fatigue crack size. The main purpose of this study is to find the effect of load ratio on the probability distribution of the fatigue crack propagation life at a specified grown crack size and to identify a well-fitting probability distribution for magnesium alloys under various fatigue load ratio conditions. To investigate the stochastic crack growth behavior, fatigue crack propagation experiments are performed on AZ31 in laboratory air under several fatigue load ratio conditions. The Anderson-Darling test is used as the goodness-of-fit test for the probability distribution of the fatigue crack propagation life. The effect of load ratio on the variability of the fatigue crack propagation life is also investigated.
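As an illustration of the Anderson-Darling test mentioned above, the following Python sketch computes the A^2 statistic for a normal distribution whose mean and standard deviation are estimated from the sample. The normal family is an assumption made for illustration only; the abstract does not say which candidate distributions were tested. For a lognormal hypothesis, the same function can be applied to the logarithms of the lives.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Return (A2, A2_adjusted) for a normal fit with estimated parameters.

    A2 = -n - (1/n) * sum_{i=1..n} (2i - 1) *
         [ln F(x_(i)) + ln(1 - F(x_(n+1-i)))]
    A2_adjusted applies the small-sample factor (1 + 0.75/n + 2.25/n^2)
    commonly used when both parameters are estimated.
    """
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    # Clamp CDF values away from 0 and 1 so the logarithms stay finite.
    cdf = [min(max(fitted.cdf(x), 1e-12), 1.0 - 1e-12) for x in xs]
    total = 0.0
    for i in range(1, n + 1):
        total += (2 * i - 1) * (math.log(cdf[i - 1])
                                + math.log(1.0 - cdf[n - i]))
    a2 = -n - total / n
    a2_adjusted = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    return a2, a2_adjusted
```

With this adjustment, the 5% critical value usually tabulated for the normal case is about 0.752; larger values of the adjusted statistic indicate a poor fit. Other candidate families need their own critical values.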

Improve Safety Performance of Un-Signalized Intersections in Oman

The main objective of this paper is to provide a new methodology for road safety assessment in Oman through the development of suitable accident prediction models. Generalized linear modelling (GLM) with Poisson or negative binomial regression (NBR) was carried out in the SAS package to develop these models. The paper used the accident data of 31 un-signalized T-intersections over three years. Five goodness-of-fit measures were used to assess the overall quality of the developed models. Two types of models were developed separately: flow-based models, which include only traffic exposure functions, and full models, which contain both exposure functions and other significant geometry and traffic variables. The results show that traffic exposure functions produced a much better fit to the accident data. The most effective geometric variables were major-road mean speed, minor-road 85th percentile speed, major-road lane width, distance to the nearest junction, and right-turn curb radius. The developed models can be used for intersection treatment or upgrading and to specify appropriate design parameters for T-intersections. Finally, the models presented in this paper reflect the intersection conditions in Oman and could represent typical conditions in several countries in the Middle East, especially the Gulf countries.

Determining the Best Fitting Distributions for Minimum Flows of Streams in Gediz Basin

Today, the need for water sources is increasing swiftly due to population growth. At the same time, it is known that some regions will face water shortages and drought because of global warming and climate change. In this context, the evaluation and analysis of hydrological data, such as observed trends and drought and flood prediction from short-term flows, is of great importance. Selecting the most accurate probability distribution is important for describing low-flow statistics in drought analysis studies. Like many basins in Turkey, the Gediz River basin will be considerably affected by drought, which will reduce the amount of water available for use. The aim of this study is to derive appropriate probability distributions for frequency analysis of annual minimum flows at 6 gauging stations of the Gediz Basin. After applying 10 probability distributions, 6 parameter estimation methods and 3 goodness-of-fit tests, the Pearson type 3 and generalized extreme value distributions were found to give the best results.

Effect of Specimen Thickness on Probability Distribution of Grown Crack Size in Magnesium Alloys

Fatigue crack growth is stochastic because fatigue behavior is uncertain and random. Therefore, it is necessary to determine the probability distribution of the grown crack size at a specific fatigue crack propagation life, both for structural maintenance and for reliability estimation. The main purpose of this study is to present a well-fitting probability distribution for the grown crack size at a specified fatigue life in a rolled magnesium alloy under different specimen thickness conditions. Fatigue crack propagation experiments are carried out on AZ31 in laboratory air under three specimen thickness conditions to investigate the stochastic crack growth behavior. The goodness of fit of the probability distribution of the grown crack size under the different specimen thickness conditions is assessed with the Anderson-Darling test. The effect of specimen thickness on the variability of the grown crack size is also investigated.

Statistical Distributions of the Lapped Transform Coefficients for Images

Discrete Cosine Transform (DCT) based transform coding is very popular in image, video and speech compression due to its good energy compaction and decorrelating properties. However, at low bit rates, the reconstructed images generally suffer from visually annoying blocking artifacts as a result of coarse quantization. The lapped transform was proposed as an alternative to the DCT with reduced blocking artifacts and increased coding gain. Lapped transforms are popular for their good performance, robustness against oversmoothing and the availability of fast implementation algorithms. However, no proper study has been reported in the literature regarding the statistical distributions of block Lapped Orthogonal Transform (LOT) and Lapped Biorthogonal Transform (LBT) coefficients. This study performs two goodness-of-fit tests, the Kolmogorov-Smirnov (KS) test and the χ² test, to determine the distribution that best fits the LOT and LBT coefficients. The experimental results show that the distribution of a majority of the significant AC coefficients can be modeled by the Generalized Gaussian distribution. Knowledge of the statistical distribution of transform coefficients greatly helps in the design of optimal quantizers that may lead to minimum distortion and hence achieve optimal coding efficiency.
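For readers unfamiliar with the Generalized Gaussian family named above, the following Python sketch gives its density in one common parameterization (scale alpha, shape beta, location mu), together with the Pearson chi-square statistic for binned counts. The parameterization and the function names are assumptions; the abstract does not specify how the distribution was parameterized or how the coefficients were binned.

```python
import math


def generalized_gaussian_pdf(x, alpha, beta, mu=0.0):
    """Density beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha)^beta).

    beta = 2 gives a Gaussian (with variance alpha^2 / 2) and
    beta = 1 gives a Laplacian with scale alpha.
    """
    coeff = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coeff * math.exp(-((abs(x - mu) / alpha) ** beta))


def chi_square_statistic(observed, expected):
    """Pearson statistic: sum over bins of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

The shape parameter is what lets a single family cover the peaked, heavy-tailed histograms typical of AC transform coefficients (beta below 2) as well as the Gaussian case.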

Probability Distribution of Rainfall Depth at Hourly Time-Scale

Rainfall data at fine resolution and knowledge of its characteristics play a major role in the efficient design and operation of agricultural, telecommunication, runoff and erosion control systems, as well as water quality control systems. This paper studies the statistical distribution of hourly rainfall depth for 12 representative stations spread across Peninsular Malaysia. Hourly rainfall data covering periods of 10 to 22 years were collected and their statistical characteristics were estimated. Three probability distributions, namely the Generalized Pareto, Exponential and Gamma distributions, were proposed to model the hourly rainfall depth, and three goodness-of-fit tests, namely the Kolmogorov-Smirnov, Anderson-Darling and Chi-Squared tests, were used to evaluate their fit. The results indicate that the east coast of the Peninsula receives a higher depth of rainfall than the west coast. However, the rainfall frequency is found to be irregular. The results of the goodness-of-fit tests also show that all three models fit the rainfall data at the 1% level of significance. However, the Generalized Pareto distribution fits better than the Exponential and Gamma distributions and is therefore recommended as the best fit.
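The comparison of candidate distributions described above can be sketched in Python with the Kolmogorov-Smirnov distance. This is a minimal sketch under stated assumptions: the exponential rate is fitted by maximum likelihood, the Generalized Pareto parameters by the method of moments with the threshold fixed at zero, and the Gamma candidate is omitted because its CDF is not available in the standard library. The fitting methods used in the paper are not stated in the abstract, and all function names here are hypothetical.

```python
import math
from statistics import mean, pvariance


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF and a candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def fit_exponential(sample):
    """Exponential CDF with rate = 1 / sample mean (maximum likelihood)."""
    rate = 1.0 / mean(sample)
    return lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def fit_generalized_pareto(sample):
    """Generalized Pareto CDF, threshold 0, fitted by the method of moments."""
    m, v = mean(sample), pvariance(sample)
    ratio = m * m / v
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)

    def cdf(x):
        if x <= 0:
            return 0.0
        if abs(shape) < 1e-9:               # limiting exponential case
            return 1.0 - math.exp(-x / scale)
        base = max(1.0 + shape * x / scale, 0.0)
        return 1.0 - base ** (-1.0 / shape)

    return cdf
```

Calling `ks_statistic(depths, fit_exponential(depths))` and `ks_statistic(depths, fit_generalized_pareto(depths))` on the same list of positive depths gives two distances that can be compared directly; the smaller one indicates the closer fit. Note that the classical KS critical values assume parameters that were not estimated from the same data, so they are only approximate here.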

On the Comparison of Several Goodness of Fit tests under Simple Random Sampling and Ranked Set Sampling

Many studies have compared the efficiency of goodness-of-fit procedures for identifying whether or not a particular distribution can adequately explain a data set. In this paper, a study is conducted to investigate the power of several goodness-of-fit tests: Kolmogorov-Smirnov (KS), Anderson-Darling (AD), Cramér-von Mises (CV) and a proposed modification of the Kolmogorov-Smirnov goodness-of-fit test which incorporates a variance stabilizing transformation (FKS). The performance of these tests is studied under simple random sampling (SRS) and ranked set sampling (RSS). This study shows that, in general, the Anderson-Darling (AD) test performs better than the other GOF tests. However, there are some cases where the proposed test performs as well as the AD test.
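The two sampling schemes compared above can be sketched in Python as follows. The sketch assumes balanced ranked set sampling with perfect ranking: in each cycle, `set_size` sets of `set_size` units are drawn, each set is ranked, and the i-th ranked unit of the i-th set is measured. The function names and the `draw` callback are hypothetical, and the abstract does not state which RSS variant the paper used.

```python
import random
from statistics import fmean, pvariance


def simple_random_sample(draw, n, rng):
    """Draw n independent units."""
    return [draw(rng) for _ in range(n)]


def ranked_set_sample(draw, set_size, cycles, rng):
    """Balanced RSS with perfect ranking; returns set_size * cycles units."""
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            ranked_set = sorted(draw(rng) for _ in range(set_size))
            sample.append(ranked_set[rank])   # keep only the rank-th unit
    return sample


def variance_of_mean(sampler, replications):
    """Monte Carlo variance of the sample mean under a sampling scheme."""
    return pvariance([fmean(sampler()) for _ in range(replications)])


rng = random.Random(42)
draw = lambda r: r.gauss(0.0, 1.0)            # standard normal population
srs_var = variance_of_mean(lambda: simple_random_sample(draw, 12, rng), 2000)
rss_var = variance_of_mean(lambda: ranked_set_sample(draw, 3, 4, rng), 2000)
```

Both schemes measure 12 units per sample, but the RSS mean is typically less variable. A power study would apply each GOF test to samples generated this way from an alternative distribution and record the rejection rate.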

Automated Algorithm for Removing Continuous Flame Spectrum Based On Sampled Linear Bases

In this paper, an automated algorithm is proposed to estimate and remove the continuous baseline from measured spectra containing both continuous and discontinuous bands. The algorithm uses prior information contained in a Continuous Database Spectra (CDBS) to obtain a linear basis, with a minimum number of sampled vectors, capable of representing a continuous baseline. The proposed algorithm was tested using a CDBS of flame spectra, where Principal Components Analysis and Non-negative Matrix Factorization were used to obtain the linear bases. In this way, the radical emissions of natural gas, oil and bio-oil flame spectra at different combustion conditions were obtained. To validate the performance of the baseline estimation process, the Goodness-of-fit Coefficient and the Root Mean-squared Error quality metrics were evaluated between the estimated and the real spectra in the absence of discontinuous emission. The results make the proposed method a key element in the development of automatic process-monitoring strategies involving discontinuous spectral bands.
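The two quality metrics named above can be sketched in Python. The Goodness-of-fit Coefficient is written here in its commonly used form, the absolute normalized inner product of the estimated and reference spectra sampled at the same wavelengths; this definition is an assumption, since the abstract does not state the formula, and the function names are hypothetical.

```python
import math


def goodness_of_fit_coefficient(estimated, reference):
    """GFC = |sum(e * r)| / (sqrt(sum(e^2)) * sqrt(sum(r^2))), in [0, 1]."""
    dot = sum(e * r for e, r in zip(estimated, reference))
    norm_e = math.sqrt(sum(e * e for e in estimated))
    norm_r = math.sqrt(sum(r * r for r in reference))
    return abs(dot) / (norm_e * norm_r)


def root_mean_squared_error(estimated, reference):
    """Square root of the mean squared difference between the two spectra."""
    n = len(estimated)
    return math.sqrt(sum((e - r) ** 2
                         for e, r in zip(estimated, reference)) / n)
```

The two metrics are complementary: the GFC is insensitive to a constant scale factor between the spectra and measures only similarity of shape, whereas the RMSE also penalizes differences in magnitude.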

Probabilistic Characteristics of older PR Frames in the Mid-America Earthquake Region

The probabilistic characteristics of two seismic responses of older steel moment frames, the partially restrained connection rotation (PRCR) and the panel zone deformation (PZD), were investigated using statistical inference in the decision-making process. Older steel moment frames of 4, 6 and 8 stories with clip angle and T-stub connections were designed and analyzed using 2%/50yrs ground motions in four cities of the Mid-America earthquake region. The probability density function and cumulative distribution function of PRCR and PZD were determined by goodness-of-fit tests based on probabilistic parameters measured from the results of the nonlinear time-history analyses. The obtained probabilistic parameters and distributions can be used to find out which performance level most PR connections and panel zones satisfy and how many PR connections and panel zones experience serious damage under the Mid-America ground motions.