Evaluation of Best-Fit Probability Distribution for Prediction of Extreme Hydrologic Phenomena

The probability distributions are the best method for forecasting of extreme hydrologic phenomena such as rainfall and flood flows. In this research, in order to determine suitable probability distribution for estimating of annual extreme rainfall and flood flows (discharge) series with different return periods, precipitation with 40 and discharge with 58 years time period had been collected from Karkheh River at Iran. After homogeneity and adequacy tests, data have been analyzed by Stormwater Management and Design Aid (SMADA) software and residual sum of squares (R.S.S). The best probability distribution was Log Pearson Type III with R.S.S value (145.91) and value (13.67) for peak discharge and Log Pearson Type III with R.S.S values (141.08) and (8.95) for maximum discharge in Jelogir Majin and Pole Zal stations, respectively. The best distribution for maximum precipitation in Jelogir Majin and Pole Zal stations was Log Pearson Type III distribution with R.S.S values (1.74&1.90) and then Pearson Type III distribution with R.S.S values (1.53&1.69). Overall, the Log Pearson Type III distributions are acceptable distribution types for representing statistics of extreme hydrologic phenomena in Karkheh River at Iran with the Pearson Type III distribution as a potential alternative.

Real Time Classification of Political Tendency of Twitter Spanish Users based on Sentiment Analysis

What people say on social media has turned into a rich source of information to understand social behavior. Specifically, the growing use of Twitter social media for political communication has arisen high opportunities to know the opinion of large numbers of politically active individuals in real time and predict the global political tendencies of a specific country. It has led to an increasing body of research on this topic. The majority of these studies have been focused on polarized political contexts characterized by only two alternatives. Unlike them, this paper tackles the challenge of forecasting Spanish political trends, characterized by multiple political parties, by means of analyzing the Twitters Users political tendency. According to this, a new strategy, named Tweets Analysis Strategy (TAS), is proposed. This is based on analyzing the users tweets by means of discovering its sentiment (positive, negative or neutral) and classifying them according to the political party they support. From this individual political tendency, the global political prediction for each political party is calculated. In order to do this, two different strategies for analyzing the sentiment analysis are proposed: one is based on Positive and Negative words Matching (PNM) and the second one is based on a Neural Networks Strategy (NNS). The complete TAS strategy has been performed in a Big-Data environment. The experimental results presented in this paper reveal that NNS strategy performs much better than PNM strategy to analyze the tweet sentiment. In addition, this research analyzes the viability of the TAS strategy to obtain the global trend in a political context make up by multiple parties with an error lower than 23%.

Statistical and Land Planning Study of Tourist Arrivals in Greece during 2005-2016

During the last 10 years, in spite of the economic crisis, the number of tourists arriving in Greece has increased, particularly during the tourist season from April to October. In this paper, the number of annual tourist arrivals is studied to explore their preferences with regard to the month of travel, the selected destinations, as well the amount of money spent. The collected data are processed with statistical methods, yielding numerical and graphical results. From the computation of statistical parameters and the forecasting with exponential smoothing, useful conclusions are arrived at that can be used by the Greek tourism authorities, as well as by tourist organizations, for planning purposes for the coming years. The results of this paper and the computed forecast can also be used for decision making by private tourist enterprises that are investing in Greece. With regard to the statistical methods, the method of Simple Exponential Smoothing of time series of data is employed. The search for a best forecast for 2017 and 2018 provides the value of the smoothing coefficient. For all statistical computations and graphics Microsoft Excel is used.

PM10 Prediction and Forecasting Using CART: A Case Study for Pleven, Bulgaria

Ambient air pollution with fine particulate matter (PM10) is a systematic permanent problem in many countries around the world. The accumulation of a large number of measurements of both the PM10 concentrations and the accompanying atmospheric factors allow for their statistical modeling to detect dependencies and forecast future pollution. This study applies the classification and regression trees (CART) method for building and analyzing PM10 models. In the empirical study, average daily air data for the city of Pleven, Bulgaria for a period of 5 years are used. Predictors in the models are seven meteorological variables, time variables, as well as lagged PM10 variables and some lagged meteorological variables, delayed by 1 or 2 days with respect to the initial time series, respectively. The degree of influence of the predictors in the models is determined. The selected best CART models are used to forecast future PM10 concentrations for two days ahead after the last date in the modeling procedure and show very accurate results.

Influence of a High-Resolution Land Cover Classification on Air Quality Modelling

Poor air quality is one of the main environmental causes of premature deaths worldwide, and mainly in cities, where the majority of the population lives. It is a consequence of successive land cover (LC) and use changes, as a result of the intensification of human activities. Knowing these landscape modifications in a comprehensive spatiotemporal dimension is, therefore, essential for understanding variations in air pollutant concentrations. In this sense, the use of air quality models is very useful to simulate the physical and chemical processes that affect the dispersion and reaction of chemical species into the atmosphere. However, the modelling performance should always be evaluated since the resolution of the input datasets largely dictates the reliability of the air quality outcomes. Among these data, the updated LC is an important parameter to be considered in atmospheric models, since it takes into account the Earth’s surface changes due to natural and anthropic actions, and regulates the exchanges of fluxes (emissions, heat, moisture, etc.) between the soil and the air. This work aims to evaluate the performance of the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem), when different LC classifications are used as an input. The influence of two LC classifications was tested: i) the 24-classes USGS (United States Geological Survey) LC database included by default in the model, and the ii) CLC (Corine Land Cover) and specific high-resolution LC data for Portugal, reclassified according to the new USGS nomenclature (33-classes). Two distinct WRF-Chem simulations were carried out to assess the influence of the LC on air quality over Europe and Portugal, as a case study, for the year 2015, using the nesting technique over three simulation domains (25 km2, 5 km2 and 1 km2 horizontal resolution). Based on the 33-classes LC approach, particular emphasis was attributed to Portugal, given the detail and higher LC spatial resolution (100 m x 100 m) than the CLC data (5000 m x 5000 m). As regards to the air quality, only the LC impacts on tropospheric ozone concentrations were evaluated, because ozone pollution episodes typically occur in Portugal, in particular during the spring/summer, and there are few research works relating to this pollutant with LC changes. The WRF-Chem results were validated by season and station typology using background measurements from the Portuguese air quality monitoring network. As expected, a better model performance was achieved in rural stations: moderate correlation (0.4 – 0.7), BIAS (10 – 21µg.m-3) and RMSE (20 – 30 µg.m-3), and where higher average ozone concentrations were estimated. Comparing both simulations, small differences grounded on the Leaf Area Index and air temperature values were found, although the high-resolution LC approach shows a slight enhancement in the model evaluation. This highlights the role of the LC on the exchange of atmospheric fluxes, and stresses the need to consider a high-resolution LC characterization combined with other detailed model inputs, such as the emission inventory, to improve air quality assessment.

Electricity Price Forecasting: A Comparative Analysis with Shallow-ANN and DNN

Electricity prices have sophisticated features such as high volatility, nonlinearity and high frequency that make forecasting quite difficult. Electricity price has a volatile and non-random character so that, it is possible to identify the patterns based on the historical data. Intelligent decision-making requires accurate price forecasting for market traders, retailers, and generation companies. So far, many shallow-ANN (artificial neural networks) models have been published in the literature and showed adequate forecasting results. During the last years, neural networks with many hidden layers, which are referred to as DNN (deep neural networks) have been using in the machine learning community. The goal of this study is to investigate electricity price forecasting performance of the shallow-ANN and DNN models for the Turkish day-ahead electricity market. The forecasting accuracy of the models has been evaluated with publicly available data from the Turkish day-ahead electricity market. Both shallow-ANN and DNN approach would give successful result in forecasting problems. Historical load, price and weather temperature data are used as the input variables for the models. The data set includes power consumption measurements gathered between January 2016 and December 2017 with one-hour resolution. In this regard, forecasting studies have been carried out comparatively with shallow-ANN and DNN models for Turkish electricity markets in the related time period. The main contribution of this study is the investigation of different shallow-ANN and DNN models in the field of electricity price forecast. All models are compared regarding their MAE (Mean Absolute Error) and MSE (Mean Square) results. DNN models give better forecasting performance compare to shallow-ANN. Best five MAE results for DNN models are 0.346, 0.372, 0.392, 0,402 and 0.409.

Deep Learning for Renewable Power Forecasting: An Approach Using LSTM Neural Networks

Load forecasting has become crucial in recent years and become popular in forecasting area. Many different power forecasting models have been tried out for this purpose. Electricity load forecasting is necessary for energy policies, healthy and reliable grid systems. Effective power forecasting of renewable energy load leads the decision makers to minimize the costs of electric utilities and power plants. Forecasting tools are required that can be used to predict how much renewable energy can be utilized. The purpose of this study is to explore the effectiveness of LSTM-based neural networks for estimating renewable energy loads. In this study, we present models for predicting renewable energy loads based on deep neural networks, especially the Long Term Memory (LSTM) algorithms. Deep learning allows multiple layers of models to learn representation of data. LSTM algorithms are able to store information for long periods of time. Deep learning models have recently been used to forecast the renewable energy sources such as predicting wind and solar energy power. Historical load and weather information represent the most important variables for the inputs within the power forecasting models. The dataset contained power consumption measurements are gathered between January 2016 and December 2017 with one-hour resolution. Models use publicly available data from the Turkish Renewable Energy Resources Support Mechanism. Forecasting studies have been carried out with these data via deep neural networks approach including LSTM technique for Turkish electricity markets. 432 different models are created by changing layers cell count and dropout. The adaptive moment estimation (ADAM) algorithm is used for training as a gradient-based optimizer instead of SGD (stochastic gradient). ADAM performed better than SGD in terms of faster convergence and lower error rates. Models performance is compared according to MAE (Mean Absolute Error) and MSE (Mean Squared Error). Best five MAE results out of 432 tested models are 0.66, 0.74, 0.85 and 1.09. The forecasting performance of the proposed LSTM models gives successful results compared to literature searches.

Time Series Modelling and Prediction of River Runoff: Case Study of Karkheh River, Iran

Rainfall and runoff phenomenon is a chaotic and complex outcome of nature which requires sophisticated modelling and simulation methods for explanation and use. Time Series modelling allows runoff data analysis and can be used as forecasting tool. In the paper attempt is made to model river runoff data and predict the future behavioural pattern of river based on annual past observations of annual river runoff. The river runoff analysis and predict are done using ARIMA model. For evaluating the efficiency of prediction to hydrological events such as rainfall, runoff and etc., we use the statistical formulae applicable. The good agreement between predicted and observation river runoff coefficient of determination (R2) display that the ARIMA (4,1,1) is the suitable model for predicting Karkheh River runoff at Iran.

Flood Predicting in Karkheh River Basin Using Stochastic ARIMA Model

Floods have huge environmental and economic impact. Therefore, flood prediction is given a lot of attention due to its importance. This study analysed the annual maximum streamflow (discharge) (AMS or AMD) of Karkheh River in Karkheh River Basin for flood predicting using ARIMA model. For this purpose, we use the Box-Jenkins approach, which contains four-stage method model identification, parameter estimation, diagnostic checking and forecasting (predicting). The main tool used in ARIMA modelling was the SAS and SPSS software. Model identification was done by visual inspection on the ACF and PACF. SAS software computed the model parameters using the ML, CLS and ULS methods. The diagnostic checking tests, AIC criterion, RACF graph and RPACF graphs, were used for selected model verification. In this study, the best ARIMA models for Annual Maximum Discharge (AMD) time series was (4,1,1) with their AIC value of 88.87. The RACF and RPACF showed residuals’ independence. To forecast AMD for 10 future years, this model showed the ability of the model to predict floods of the river under study in the Karkheh River Basin. Model accuracy was checked by comparing the predicted and observation series by using coefficient of determination (R2).

A Comparative Analysis of the Performance of COSMO and WRF Models in Quantitative Rainfall Prediction

The Numerical weather prediction (NWP) models are considered powerful tools for guiding quantitative rainfall prediction. A couple of NWP models exist and are used at many operational weather prediction centers. This study considers two models namely the Consortium for Small–scale Modeling (COSMO) model and the Weather Research and Forecasting (WRF) model. It compares the models’ ability to predict rainfall over Uganda for the period 21st April 2013 to 10th May 2013 using the root mean square (RMSE) and the mean error (ME). In comparing the performance of the models, this study assesses their ability to predict light rainfall events and extreme rainfall events. All the experiments used the default parameterization configurations and with same horizontal resolution (7 Km). The results show that COSMO model had a tendency of largely predicting no rain which explained its under–prediction. The COSMO model (RMSE: 14.16; ME: -5.91) presented a significantly (p = 0.014) higher magnitude of error compared to the WRF model (RMSE: 11.86; ME: -1.09). However the COSMO model (RMSE: 3.85; ME: 1.39) performed significantly (p = 0.003) better than the WRF model (RMSE: 8.14; ME: 5.30) in simulating light rainfall events. All the models under–predicted extreme rainfall events with the COSMO model (RMSE: 43.63; ME: -39.58) presenting significantly higher error magnitudes than the WRF model (RMSE: 35.14; ME: -26.95). This study recommends additional diagnosis of the models’ treatment of deep convection over the tropics.

A Non-Linear Eddy Viscosity Model for Turbulent Natural Convection in Geophysical Flows

Eddy viscosity models in turbulence modeling can be mainly classified as linear and nonlinear models. Linear formulations are simple and require less computational resources but have the disadvantage that they cannot predict actual flow pattern in complex geophysical flows where streamline curvature and swirling motion are predominant. A constitutive equation of Reynolds stress anisotropy is adopted for the formulation of eddy viscosity including all the possible higher order terms quadratic in the mean velocity gradients, and a simplified model is developed for actual oceanic flows where only the vertical velocity gradients are important. The new model is incorporated into the one dimensional General Ocean Turbulence Model (GOTM). Two realistic oceanic test cases (OWS Papa and FLEX' 76) have been investigated. The new model predictions match well with the observational data and are better in comparison to the predictions of the two equation k-epsilon model. The proposed model can be easily incorporated in the three dimensional Princeton Ocean Model (POM) to simulate a wide range of oceanic processes. Practically, this model can be implemented in the coastal regions where trasverse shear induces higher vorticity, and for prediction of flow in estuaries and lakes, where depth is comparatively less. The model predictions of marine turbulence and other related data (e.g. Sea surface temperature, Surface heat flux and vertical temperature profile) can be utilized in short term ocean and climate forecasting and warning systems.

Forecasting of Scaffolding Work Comfort Parameters Based on Data from Meteorological Stations

Work at height, such as construction works on scaffoldings, is associated with a considerable risk. Scaffolding workers are usually exposed to changing weather conditions what can additionally increase the risk of dangerous situations. Therefore, it is very important to foresee the risk of adverse conditions to which the worker may be exposed. The data from meteorological stations may be used to asses this risk. However, the dependency between weather conditions on a scaffolding and in the vicinity of meteorological station, should be determined. The paper presents an analysis of two selected environmental parameters which have influence on the behavior of workers – air temperature and wind speed. Measurements of these parameters were made between April and November of 2016 on ten scaffoldings located in different parts of Poland. They were compared with the results taken from the meteorological stations located closest to the studied scaffolding. The results gathered from the construction sites and meteorological stations were not the same, but statistical analyses have shown that they were correlated.

Forecasting Electricity Spot Price with Generalized Long Memory Modeling: Wavelet and Neural Network

This aims of this paper is to forecast the electricity spot prices. First, we focus on modeling the conditional mean of the series so we adopt a generalized fractional -factor Gegenbauer process (k-factor GARMA). Secondly, the residual from the -factor GARMA model has used as a proxy for the conditional variance; these residuals were predicted using two different approaches. In the first approach, a local linear wavelet neural network model (LLWNN) has developed to predict the conditional variance using the Back Propagation learning algorithms. In the second approach, the Gegenbauer generalized autoregressive conditional heteroscedasticity process (G-GARCH) has adopted, and the parameters of the k-factor GARMA-G-GARCH model has estimated using the wavelet methodology based on the discrete wavelet packet transform (DWPT) approach. The empirical results have shown that the k-factor GARMA-G-GARCH model outperform the hybrid k-factor GARMA-LLWNN model, and find it is more appropriate for forecasts.

Sphere in Cube Grid Approach to Modelling of Shale Gas Production Using Non-Linear Flow Mechanisms

Shale gas is one of the most rapidly growing forms of natural gas. Unconventional natural gas deposits are difficult to characterize overall, but in general are often lower in resource concentration and dispersed over large areas. Moreover, gas is densely packed into the matrix through adsorption which accounts for large volume of gas reserves. Gas production from tight shale deposits are made possible by extensive and deep well fracturing which contacts large fractions of the formation. The conventional reservoir modelling and production forecasting methods, which rely on fluid-flow processes dominated by viscous forces, have proved to be very pessimistic and inaccurate. This paper presents a new approach to forecast shale gas production by detailed modeling of gas desorption, diffusion and non-linear flow mechanisms in combination with statistical representation of these processes. The representation of the model involves a cube as a porous media where free gas is present and a sphere (SiC: Sphere in Cube model) inside it where gas is adsorbed on to the kerogen or organic matter. Further, the sphere is considered consisting of many layers of adsorbed gas in an onion-like structure. With pressure decline, the gas desorbs first from the outer most layer of sphere causing decrease in its molecular concentration. The new available surface area and change in concentration triggers the diffusion of gas from kerogen. The process continues until all the gas present internally diffuses out of the kerogen, gets adsorbs onto available surface area and then desorbs into the nanopores and micro-fractures in the cube. Each SiC idealizes a gas pathway and is characterized by sphere diameter and length of the cube. The diameter allows to model gas storage, diffusion and desorption; the cube length takes into account the pathway for flow in nanopores and micro-fractures. Many of these representative but general cells of the reservoir are put together and linked to a well or hydraulic fracture. The paper quantitatively describes these processes as well as clarifies the geological conditions under which a successful shale gas production could be expected. A numerical model has been derived which is then compiled on FORTRAN to develop a simulator for the production of shale gas by considering the spheres as a source term in each of the grid blocks. By applying SiC to field data, we demonstrate that the model provides an effective way to quickly access gas production rates from shale formations. We also examine the effect of model input properties on gas production.

Input Data Balancing in a Neural Network PM-10 Forecasting System

Recently PM-10 has become a social and global issue. It is one of major air pollutants which affect human health. Therefore, it needs to be forecasted rapidly and precisely. However, PM-10 comes from various emission sources, and its level of concentration is largely dependent on meteorological and geographical factors of local and global region, so the forecasting of PM-10 concentration is very difficult. Neural network model can be used in the case. But, there are few cases of high concentration PM-10. It makes the learning of the neural network model difficult. In this paper, we suggest a simple input balancing method when the data distribution is uneven. It is based on the probability of appearance of the data. Experimental results show that the input balancing makes the neural networks’ learning easy and improves the forecasting rates.

A Comparative Analysis of Artificial Neural Network and Autoregressive Integrated Moving Average Model on Modeling and Forecasting Exchange Rate

This paper examines the forecasting performance of Autoregressive Integrated Moving Average (ARIMA) and Artificial Neural Networks (ANN) models with the published exchange rate obtained from South African Reserve Bank (SARB). ARIMA is one of the popular linear models in time series forecasting for the past decades. ARIMA and ANN models are often compared and literature revealed mixed results in terms of forecasting performance. The study used the MSE and MAE to measure the forecasting performance of the models. The empirical results obtained reveal the superiority of ARIMA model over ANN model. The findings further resolve and clarify the contradiction reported in literature over the superiority of ARIMA and ANN models.

Forecasting the Volatility of Geophysical Time Series with Stochastic Volatility Models

This work is devoted to the study of modeling geophysical time series. A stochastic technique with time-varying parameters is used to forecast the volatility of data arising in geophysics. In this study, the volatility is defined as a logarithmic first-order autoregressive process. We observe that the inclusion of log-volatility into the time-varying parameter estimation significantly improves forecasting which is facilitated via maximum likelihood estimation. This allows us to conclude that the estimation algorithm for the corresponding one-step-ahead suggested volatility (with ±2 standard prediction errors) is very feasible since it possesses good convergence properties.

Forecasting Materials Demand from Multi-Source Ordering

The downstream manufactures will order their materials from different upstream suppliers to maintain a certain level of the demand. This paper proposes a bivariate model to portray this phenomenon of material demand. We use empirical data to estimate the parameters of model and evaluate the RMSD of model calibration. The results show that the model has better fitness.

Modern Trends in Foreign Direct Investments in Georgia

Foreign direct investment is a driving force in the development of the interdependent national economies, and the study and analysis of investments is an urgent problem. It is particularly important for transitional economies, such as Georgia, and the study and analysis of investments is an urgent problem. Consequently, the goal of the research is the study and analysis of direct foreign investments in Georgia, and identification and forecasting of modern trends, and covers the period of 2006-2015. The study uses the methods of statistical observation, grouping and analysis, the methods of analytical indicators of time series, trend identification and the predicted values are calculated, as well as various literary and Internet sources relevant to the research. The findings showed that modern investment policy In Georgia is favorable for domestic as well as foreign investors. Georgia is still a net importer of investments. In 2015, the top 10 investing countries was led by Azerbaijan, United Kingdom and Netherlands, and the largest share of FDIs were allocated in the transport and communication sector; the financial sector was the second, followed by the health and social work sector, and the same trend will continue in the future. 

Load Forecasting in Microgrid Systems with R and Cortana Intelligence Suite

Energy production optimization has been traditionally very important for utilities in order to improve resource consumption. However, load forecasting is a challenging task, as there are a large number of relevant variables that must be considered, and several strategies have been used to deal with this complex problem. This is especially true also in microgrids where many elements have to adjust their performance depending on the future generation and consumption conditions. The goal of this paper is to present a solution for short-term load forecasting in microgrids, based on three machine learning experiments developed in R and web services built and deployed with different components of Cortana Intelligence Suite: Azure Machine Learning, a fully managed cloud service that enables to easily build, deploy, and share predictive analytics solutions; SQL database, a Microsoft database service for app developers; and PowerBI, a suite of business analytics tools to analyze data and share insights. Our results show that Boosted Decision Tree and Fast Forest Quantile regression methods can be very useful to predict hourly short-term consumption in microgrids; moreover, we found that for these types of forecasting models, weather data (temperature, wind, humidity and dew point) can play a crucial role in improving the accuracy of the forecasting solution. Data cleaning and feature engineering methods performed in R and different types of machine learning algorithms (Boosted Decision Tree, Fast Forest Quantile and ARIMA) will be presented, and results and performance metrics discussed.