A Large Dataset Imputation Approach Applied to Country Conflict Prediction Data

This study demonstrates an alternative stochastic imputation approach for large datasets when preferred commercial packages struggle to iterate due to numerical problems. A large country conflict dataset motivates the search to impute missing values well over a common threshold of 20% missingness. The methodology capitalizes on correlation while using model residuals to provide the uncertainty in estimating unknown values. Examination of the methodology provides insight toward choosing linear or nonlinear modeling terms. Static tolerances common in most packages are replaced with tailorable tolerances that exploit residuals to fit each data element. The methodology evaluation includes observing computation time, model fit, and the comparison of known  values to replaced values created through imputation. Overall, the country conflict dataset illustrates promise with modeling first-order interactions, while presenting a need for further refinement that mimics predictive mean matching.

The Quality Assessment of Seismic Reflection Survey Data Using Statistical Analysis: A Case Study of Fort Abbas Area, Cholistan Desert, Pakistan

In geophysical exploration surveys, the quality of acquired data holds significant importance before executing the data processing and interpretation phases. In this study, 2D seismic reflection survey data of Fort Abbas area, Cholistan Desert, Pakistan was taken as test case in order to assess its quality on statistical bases by using normalized root mean square error (NRMSE), Cronbach’s alpha test (α) and null hypothesis tests (t-test and F-test). The analysis challenged the quality of the acquired data and highlighted the significant errors in the acquired database. It is proven that the study area is plain, tectonically least affected and rich in oil and gas reserves. However, subsurface 3D modeling and contouring by using acquired database revealed high degrees of structural complexities and intense folding. The NRMSE had highest percentage of residuals between the estimated and predicted cases. The outcomes of hypothesis testing also proved the biasness and erraticness of the acquired database. Low estimated value of alpha (α) in Cronbach’s alpha test confirmed poor reliability of acquired database. A very low quality of acquired database needs excessive static correction or in some cases, reacquisition of data is also suggested which is most of the time not feasible on economic grounds. The outcomes of this study could be used to assess the quality of large databases and to further utilize as a guideline to establish database quality assessment models to make much more informed decisions in hydrocarbon exploration field.

Managing the Baltic Sea Region Resilience: Prevention, Treatment Actions and Circular Economy

The worldwide future sustainable economies are oriented towards the sea: the maritime economy is becoming one of the strongest driving forces in many regions as population growth is the highest in coastal areas. For hundreds of years sea resources were depleted unsustainably by fishing, mining, transportation, tourism, and waste. European Sustainable Development Strategy is identifying and developing actions to enable the EU to achieve a continuous, long-term improvement of the quality of life through the creation of sustainable communities. The aim of this paper is to provide insight in Baltic Sea Region case studies on implemented actions on tourism industry waste and beach wrack management in coastal areas, hazardous contaminants and plastic flow treatment from waste, wastewaters and stormwaters. These projects mentioned in study promote successful prevention of contaminant flows to the sea environments and provide perspectives for creation of valuable new products from residuals for future circular economy are the step forward to green innovation winning streak.

The Evaluation of Gravity Anomalies Based on Global Models by Land Gravity Data

The Earth system generates different phenomena that are observable at the surface of the Earth such as mass deformations and displacements leading to plate tectonics, earthquakes, and volcanism. The dynamic processes associated with the interior, surface, and atmosphere of the Earth affect the three pillars of geodesy: shape of the Earth, its gravity field, and its rotation. Geodesy establishes a characteristic structure in order to define, monitor, and predict of the whole Earth system. The traditional and new instruments, observables, and techniques in geodesy are related to the gravity field. Therefore, the geodesy monitors the gravity field and its temporal variability in order to transform the geodetic observations made on the physical surface of the Earth into the geometrical surface in which positions are mathematically defined. In this paper, the main components of the gravity field modeling, (Free-air and Bouguer) gravity anomalies are calculated via recent global models (EGM2008, EIGEN6C4, and GECO) over a selected study area. The model-based gravity anomalies are compared with the corresponding terrestrial gravity data in terms of standard deviation (SD) and root mean square error (RMSE) for determining the best fit global model in the study area at a regional scale in Turkey. The least SD (13.63 mGal) and RMSE (15.71 mGal) were obtained by EGM2008 for the Free-air gravity anomaly residuals. For the Bouguer gravity anomaly residuals, EIGEN6C4 provides the least SD (8.05 mGal) and RMSE (8.12 mGal). The results indicated that EIGEN6C4 can be a useful tool for modeling the gravity field of the Earth over the study area.

Dynamic Correlations and Portfolio Optimization between Islamic and Conventional Equity Indexes: A Vine Copula-Based Approach

This study examines conditional Value at Risk by applying the GJR-EVT-Copula model, and finds the optimal portfolio for eight Dow Jones Islamic-conventional pairs. Our methodology consists of modeling the data by a bivariate GJR-GARCH model in which we extract the filtered residuals and then apply the Peak over threshold model (POT) to fit the residual tails in order to model marginal distributions. After that, we use pair-copula to find the optimal portfolio risk dependence structure. Finally, with Monte Carlo simulations, we estimate the Value at Risk (VaR) and the conditional Value at Risk (CVaR). The empirical results show the VaR and CVaR values for an equally weighted portfolio of Dow Jones Islamic-conventional pairs. In sum, we found that the optimal investment focuses on Islamic-conventional US Market index pairs because of high investment proportion; however, all other index pairs have low investment proportion. These results deliver some real repercussions for portfolio managers and policymakers concerning to optimal asset allocations, portfolio risk management and the diversification advantages of these markets.

Flood Predicting in Karkheh River Basin Using Stochastic ARIMA Model

Floods have huge environmental and economic impact. Therefore, flood prediction is given a lot of attention due to its importance. This study analysed the annual maximum streamflow (discharge) (AMS or AMD) of Karkheh River in Karkheh River Basin for flood predicting using ARIMA model. For this purpose, we use the Box-Jenkins approach, which contains four-stage method model identification, parameter estimation, diagnostic checking and forecasting (predicting). The main tool used in ARIMA modelling was the SAS and SPSS software. Model identification was done by visual inspection on the ACF and PACF. SAS software computed the model parameters using the ML, CLS and ULS methods. The diagnostic checking tests, AIC criterion, RACF graph and RPACF graphs, were used for selected model verification. In this study, the best ARIMA models for Annual Maximum Discharge (AMD) time series was (4,1,1) with their AIC value of 88.87. The RACF and RPACF showed residuals’ independence. To forecast AMD for 10 future years, this model showed the ability of the model to predict floods of the river under study in the Karkheh River Basin. Model accuracy was checked by comparing the predicted and observation series by using coefficient of determination (R2).

Forecasting Electricity Spot Price with Generalized Long Memory Modeling: Wavelet and Neural Network

This aims of this paper is to forecast the electricity spot prices. First, we focus on modeling the conditional mean of the series so we adopt a generalized fractional -factor Gegenbauer process (k-factor GARMA). Secondly, the residual from the -factor GARMA model has used as a proxy for the conditional variance; these residuals were predicted using two different approaches. In the first approach, a local linear wavelet neural network model (LLWNN) has developed to predict the conditional variance using the Back Propagation learning algorithms. In the second approach, the Gegenbauer generalized autoregressive conditional heteroscedasticity process (G-GARCH) has adopted, and the parameters of the k-factor GARMA-G-GARCH model has estimated using the wavelet methodology based on the discrete wavelet packet transform (DWPT) approach. The empirical results have shown that the k-factor GARMA-G-GARCH model outperform the hybrid k-factor GARMA-LLWNN model, and find it is more appropriate for forecasts.

A Comparison of Inverse Simulation-Based Fault Detection in a Simple Robotic Rover with a Traditional Model-Based Method

Robotic rovers which are designed to work in extra-terrestrial environments present a unique challenge in terms of the reliability and availability of systems throughout the mission. Should some fault occur, with the nearest human potentially millions of kilometres away, detection and identification of the fault must be performed solely by the robot and its subsystems. Faults in the system sensors are relatively straightforward to detect, through the residuals produced by comparison of the system output with that of a simple model. However, faults in the input, that is, the actuators of the system, are harder to detect. A step change in the input signal, caused potentially by the loss of an actuator, can propagate through the system, resulting in complex residuals in multiple outputs. These residuals can be difficult to isolate or distinguish from residuals caused by environmental disturbances. While a more complex fault detection method or additional sensors could be used to solve these issues, an alternative is presented here. Using inverse simulation (InvSim), the inputs and outputs of the mathematical model of the rover system are reversed. Thus, for a desired trajectory, the corresponding actuator inputs are obtained. A step fault near the input then manifests itself as a step change in the residual between the system inputs and the input trajectory obtained through inverse simulation. This approach avoids the need for additional hardware on a mass- and power-critical system such as the rover. The InvSim fault detection method is applied to a simple four-wheeled rover in simulation. Additive system faults and an external disturbance force and are applied to the vehicle in turn, such that the dynamic response and sensor output of the rover are impacted. Basic model-based fault detection is then employed to provide output residuals which may be analysed to provide information on the fault/disturbance. InvSim-based fault detection is then employed, similarly providing input residuals which provide further information on the fault/disturbance. The input residuals are shown to provide clearer information on the location and magnitude of an input fault than the output residuals. Additionally, they can allow faults to be more clearly discriminated from environmental disturbances.

A Comparative Study of Additive and Nonparametric Regression Estimators and Variable Selection Procedures

One of the biggest challenges in nonparametric regression is the curse of dimensionality. Additive models are known to overcome this problem by estimating only the individual additive effects of each covariate. However, if the model is misspecified, the accuracy of the estimator compared to the fully nonparametric one is unknown. In this work the efficiency of completely nonparametric regression estimators such as the Loess is compared to the estimators that assume additivity in several situations, including additive and non-additive regression scenarios. The comparison is done by computing the oracle mean square error of the estimators with regards to the true nonparametric regression function. Then, a backward elimination selection procedure based on the Akaike Information Criteria is proposed, which is computed from either the additive or the nonparametric model. Simulations show that if the additive model is misspecified, the percentage of time it fails to select important variables can be higher than that of the fully nonparametric approach. A dimension reduction step is included when nonparametric estimator cannot be computed due to the curse of dimensionality. Finally, the Boston housing dataset is analyzed using the proposed backward elimination procedure and the selected variables are identified.

Combining ASTER Thermal Data and Spatial-Based Insolation Model for Identification of Geothermal Active Areas

In this study, we integrated ASTER thermal data with an area-based spatial insolation model to identify and delineate geothermally active areas in Yellowstone National Park (YNP). Two pairs of L1B ASTER day- and nighttime scenes were used to calculate land surface temperature. We employed the Emissivity Normalization Algorithm which separates temperature from emissivity to calculate surface temperature. We calculated the incoming solar radiation for the area covered by each of the four ASTER scenes using an insolation model and used this information to compute temperature due to solar radiation. We then identified the statistical thermal anomalies using land surface temperature and the residuals calculated from modeled temperatures and ASTER-derived surface temperatures. Areas that had temperatures or temperature residuals greater than 2σ and between 1σ and 2σ were considered ASTER-modeled thermal anomalies. The areas identified as thermal anomalies were in strong agreement with the thermal areas obtained from the YNP GIS database. Also the YNP hot springs and geysers were located within areas identified as anomalous thermal areas. The consistency between our results and known geothermally active areas indicate that thermal remote sensing data, integrated with a spatial-based insolation model, provides an effective means for identifying and locating areas of geothermal activities over large areas and rough terrain.

Least Squares Method Identification of Corona Current-Voltage Characteristics and Electromagnetic Field in Electrostatic Precipitator

This paper aims to analysis the behavior of DC corona discharge in wire-to-plate electrostatic precipitators (ESP). Currentvoltage curves are particularly analyzed. Experimental results show that discharge current is strongly affected by the applied voltage. The proposed method of current identification is to use the method of least squares. Least squares problems that of into two categories: linear or ordinary least squares and non-linear least squares, depending on whether or not the residuals are linear in all unknowns. The linear least-squares problem occurs in statistical regression analysis; it has a closed-form solution. A closed-form solution (or closed form expression) is any formula that can be evaluated in a finite number of standard operations. The non-linear problem has no closed-form solution and is usually solved by iterative.

Kinetics Study for the Recombinant Cellulosome to the Degradation of Chlorella Cell Residuals

In this study, lipid-deprived residuals of microalgae were hydrolyzed for the production of reducing sugars by using the recombinant Bacillus cellulosome, carrying eight genes from the Clostridium thermocellum ATCC27405. The obtained cellulosome was found to exist mostly in the broth supernatant with a cellulosome activity of 2.4 U/mL. Furthermore, the Michaelis-Menten constant (Km) and Vmax of cellulosome were found to be 14.832 g/L and 3.522 U/mL. The activation energy of the cellulosome to hydrolyze microalgae LDRs was calculated as 32.804 kJ/mol.

Relationship between Sums of Squares in Linear Regression and Semi-parametric Regression

In this paper, the sum of squares in linear regression is reduced to sum of squares in semi-parametric regression. We indicated that different sums of squares in the linear regression are similar to various deviance statements in semi-parametric regression. In addition to, coefficient of the determination derived in linear regression model is easily generalized to coefficient of the determination of the semi-parametric regression model. Then, it is made an application in order to support the theory of the linear regression and semi-parametric regression. In this way, study is supported with a simulated data example.

Quantitative Estimation of Periodicities in Lyari River Flow Routing

The hydrologic time series data display periodic structure and periodic autoregressive process receives considerable attention in modeling of such series. In this communication long term record of monthly waste flow of Lyari river is utilized to quantify by using PAR modeling technique. The parameters of model are estimated by using Frances & Paap methodology. This study shows that periodic autoregressive model of order 2 is the most parsimonious model for assessing periodicity in waste flow of the river. A careful statistical analysis of residuals of PAR (2) model is used for establishing goodness of fit. The forecast by using proposed model confirms significance and effectiveness of the model.

Using Low Permeability Sand-Fadr Mixture Membrane for Isolated Swelling Soil

Desert regions around the Nile valley in Upper Egypt contain great extent of swelling soil. Many different comment procedures of treatment of the swelling soils for construction such as pre-swelling, load balance OR soil replacement. One of the measure factors which affect the level of the aggressiveness of the swelling soil is the direction of the infiltration water directions within the swelling soils. In this paper a physical model was installed to measure the effect of water on the swelling soil with replacement using fatty acid distillation residuals (FADR) mixed with sand as thick sand-FADR mixture to prevent the water pathway arrive to the swelling soil. Testing program have been conducted on different artificial samples with different sand to FADR contents ratios (4%, 6%, and 9%) to get the optimum value fulfilling the impermeable replacement. The tests show that a FADR content of 9% is sufficient to produce impermeable replacement.

A Comparison of the Sum of Squares in Linear and Partial Linear Regression Models

In this paper, estimation of the linear regression model is made by ordinary least squares method and the partially linear regression model is estimated by penalized least squares method using smoothing spline. Then, it is investigated that differences and similarity in the sum of squares related for linear regression and partial linear regression models (semi-parametric regression models). It is denoted that the sum of squares in linear regression is reduced to sum of squares in partial linear regression models. Furthermore, we indicated that various sums of squares in the linear regression are similar to different deviance statements in partial linear regression. In addition to, coefficient of the determination derived in linear regression model is easily generalized to coefficient of the determination of the partial linear regression model. For this aim, it is made two different applications. A simulated and a real data set are considered to prove the claim mentioned here. In this way, this study is supported with a simulation and a real data example.

How to Win Passengers and Influence Motorists? Lessons Learned from a Comparative Study of Global Transit Systems

Due to the call of global warming effects, city planners aim at actions for reducing carbon emission. One of the approaches is to promote the usage of public transportation system toward the transit-oriented-development. For example, rapid transit system in Taipei city and Kaohsiung city are opening. However, until November 2008 the average daily patronage counted only 113,774 passengers at Kaohsiung MRT systems, much less than which was expected. Now the crucial questions: how the public transport competes with private transport? And more importantly, what factors would enhance the use of public transport? To give the answers to those questions, our study first applied regression to analyze the factors attracting people to use public transport around cities in the world. It is shown in our study that the number of MRT stations, city population, cost of living, transit fare, density, gasoline price, and scooter being a major mode of transport are the major factors. Subsequently, our study identified successful and unsuccessful cities in regard of the public transport usage based on the diagnosis of regression residuals. Finally, by comparing transportation strategies adopted by those successful cities, our conclusion stated that Kaohsiung City could apply strategies such as increasing parking fees, reducing parking spaces in downtown area, and reducing transfer time by providing more bus services and public bikes to promote the usage of public transport.

Bond Graph and Bayesian Networks for Reliable Diagnosis

Bond Graph as a unified multidisciplinary tool is widely used not only for dynamic modelling but also for Fault Detection and Isolation because of its structural and causal proprieties. A binary Fault Signature Matrix is systematically generated but to make the final binary decision is not always feasible because of the problems revealed by such method. The purpose of this paper is introducing a methodology for the improvement of the classical binary method of decision-making, so that the unknown and identical failure signatures can be treated to improve the robustness. This approach consists of associating the evaluated residuals and the components reliability data to build a Hybrid Bayesian Network. This network is used in two distinct inference procedures: one for the continuous part and the other for the discrete part. The continuous nodes of the network are the prior probabilities of the components failures, which are used by the inference procedure on the discrete part to compute the posterior probabilities of the failures. The developed methodology is applied to a real steam generator pilot process.

Improving Co-integration Trading Rule Profitability with Forecasts from an Artificial Neural Network

Co-integration models the long-term, equilibrium relationship of two or more related financial variables. Even if cointegration is found, in the short run, there may be deviations from the long run equilibrium relationship. The aim of this work is to forecast these deviations using neural networks and create a trading strategy based on them. A case study is used: co-integration residuals from Australian Bank Bill futures are forecast and traded using various exogenous input variables combined with neural networks. The choice of the optimal exogenous input variables chosen for each neural network, undertaken in previous work [1], is validated by comparing the forecasts and corresponding profitability of each, using a trading strategy.

Robust Parameter and Scale Factor Estimation in Nonstationary and Impulsive Noise Environment

The problem of FIR system parameter estimation has been considered in the paper. A new robust recursive algorithm for simultaneously estimation of parameters and scale factor of prediction residuals in non-stationary environment corrupted by impulsive noise has been proposed. The performance of derived algorithm has been tested by simulations.