A Comparison of Air Pollution in Developed and Developing Cities: A Case Study of London and Beijing

With the rapid development of industrialization, countries in different stages of development in the world have gradually begun to pay attention to the impact of air pollution on health and the environment. Air control in developed countries is an effective reference for air control in developing countries. Artificial intelligence and other technologies also play a positive role in the prediction of air pollution. By comparing the annual changes of pollution in London and Beijing, this paper concludes that the pollution in developed cities is relatively low and stable, while the pollution in Beijing is relatively heavy and unstable, but is clearly improving. In addition, by analyzing the changes of major pollutants in Beijing in the past eight years, it is concluded that all pollutants except O3 show a significant downward trend. In addition, all pollutants except O3 have certain correlation. For example, PM10 and PM2.5 have the greatest influence on air quality index (AQI). Python, which is commonly used by artificial intelligence, is used as the main software to establish two models, support vector machine (SVM) and linear regression. By comparing the two models under the same conditions, it is concluded that SVM has higher accuracy in pollution prediction. The results of this study provide valuable reference for pollution control and prediction in developing countries.

Injury Prediction for Soccer Players Using Machine Learning

Injuries in professional sports occur on a regular basis. Some may be minor while others can cause huge impact on a player’s career and earning potential. In soccer, there is a high risk of players picking up injuries during game time. This research work seeks to help soccer players reduce the risk of getting injured by predicting the likelihood of injury while playing in the near future and then providing recommendations for intervention. The injury prediction tool will use a soccer player’s number of minutes played on the field, number of appearances, distance covered and performance data for the current and previous seasons as variables to conduct statistical analysis and provide injury predictive results using a machine learning linear regression model.

The Profit Trend of Cosmetics Products Using Bootstrap Edgeworth Approximation

Edgeworth approximation is one of the most important statistical methods and contributes considerably to reducing the sum of the standard deviations of the independent variables’ coefficients in a Quantile Regression Model. This model estimates the conditional median or other quantiles. In this paper, we apply approximating statistical methods to an economic problem. We build a quantile regression model to see how the profit gained is connected with the realized sales of cosmetic products in real data taken from a local business. The Linear Regression of the generated profit on the realized sales was not free of autocorrelation and heteroscedasticity, which is why we used this model instead of Linear Regression. Our aim is to analyze in more detail the relation between the variables under study, the profit and the finalized sales, and how to minimize the standard errors of the independent variable involved in this study, the level of realized sales. The statistical methods applied in our work are the Edgeworth Approximation for independent and identically distributed (IID) cases, the Bootstrap version of the Model, and the Edgeworth approximation for the Bootstrap Quantile Regression Model. The graphics and results presented here identify the best approximating model for our study.

Machine Learning Based Approach for Measuring Promotion Effectiveness in Multiple Parallel Promotions’ Scenarios

Promotion is a key element in the retail business. Thus, analysis of promotions to quantify their effectiveness in terms of Revenue and/or Margin is an essential activity in the retail industry. However, measuring the sales/revenue uplift is based on estimations, as the actual sales/revenue without the promotion is not available. Further, the presence of Halo and Cannibalization in a multiple parallel promotions’ scenario complicates the problem. Calculating Baseline by considering inter-brand/competitor items or using Halo and Cannibalization's impact on Revenue calculations by considering Baseline as an interpretation of items’ unit sales in neighboring nonpromotional weeks individually may not capture the overall Revenue uplift in the case of multiple parallel promotions. Hence, this paper proposes a Machine Learning based method for calculating the Revenue uplift by considering the Halo and Cannibalization impact on the Baseline and the Revenue. In the first section of the proposed methodology, the Baseline of an item is calculated by incorporating the impact of the promotions on its related items. In the later section, the Revenue of an item is calculated by considering both Halo and Cannibalization impacts. Hence, this methodology enables correct calculation of the overall Revenue uplift due to a given promotion.

Exploring the Effect of Accounting Information on Systematic Risk: An Empirical Evidence of Tehran Stock Exchange

This paper highlights the empirical results of analyzing the correlation between accounting information and systematic risk. The association between financial ratios and systematic risk is analyzed using the financial statements of 39 companies listed on the Tehran Stock Exchange (TSE) over five years (2014-2018). Financial ratios were categorized into four groups and, to describe their specific features, the following were selected as representative of accounting information: Return on Assets (ROA), Debt Ratio (Total Debt to Total Assets), Current Ratio (current assets to current debt), Asset Turnover (Net sales to Total assets), and Total Assets. The hypotheses were tested through simple and multiple linear regression and Student's t-test. The findings illustrate that there is no significant relationship between accounting information and market risk. This indicates that in the selected sample, historical accounting information does not fully reflect the price of stocks.

Prevalence, Associated Factors, and Help-Seeking Behavior of Psychological Distress among International Students at the National University of Malaysia

Depression, anxiety, and stress are associated with decreased role functioning, productivity, and quality of life. International students are more prone to psychological distress as they face many stressors while studying abroad. The objectives of the study were to determine the prevalence and associated factors of depression, anxiety, and stress among international students, their help-seeking behavior, and their awareness of the available on-campus mental support services. A cross-sectional study with a purposive sampling method was performed on 280 international students at Universiti Kebangsaan Malaysia (UKM) between the age of 18 and 35 years. The Depression Anxiety Stress Scale-21 (DASS-21) questionnaire was used anonymously to assess the mental health of students. Socio-demographic, help-seeking behavior, and awareness data were obtained. Independent sample t-test, one-way ANOVA test, and multiple linear regression were used to explore associated factors. The overall prevalence of depression, anxiety, and stress among international students were 58.9%, 71.8%, and 53.9%, respectively. Age was significantly associated with depression and anxiety. Ethnicity showed a significant association with depression and stress. No other factors were found to be significantly associated with psychological distress. Only 9.6% of the international students had sought help from on-campus mental support services. Students who were aware of the presence of such services were only 21.4% of the participants. In conclusion, this study addressed the gap in the literature on the mental health of international students and provided data that could be used in intervention programs to improve the mental health of the increasing number of international students in Malaysia.

Liquid Chromatography Microfluidics for Detection and Quantification of Urine Albumin Using Linear Regression Method

Nearly a hundred per million of the Filipino population is diagnosed with Chronic Kidney Disease (CKD). The early stage of CKD has no symptoms and can only be discovered once the patient undergoes urinalysis. Over the years, different methods have been developed and used for the quantification of urinary albumin. These include immunochemical assays, most of which require large machinery with high maintenance and resource costs, and the dipstick test, which is yet to be proven and is still debated as a reliable method for detecting early stages of microalbuminuria. This research study involves the use of the liquid chromatography concept in microfluidic instruments with a biosensor, as the means of separation and detection respectively, and linear regression to quantify human urinary albumin. The researchers’ main objective was to create a miniature system that detects and quantifies patients’ urinary albumin while reducing the volume used per five test samples. For this study, 30 urine samples of unknown albumin concentrations were tested using the VITROS Analyzer and the microfluidic system for comparison. Based on the data from both methods, the actual vs. predicted regression showed a positive linear relationship with an R² of 0.9995 and a linear equation of y = 1.09x + 0.07, indicating that the predicted values and actual values are approximately equal. Furthermore, the microfluidic instrument uses 75% less total volume (sample and reagents combined) than the VITROS Analyzer per five test samples.
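As an illustration of how an actual vs. predicted comparison of this kind can be summarized, the following is a minimal sketch of an ordinary least-squares line fit with its coefficient of determination. The function name `fit_line`, the choice of the reference analyzer as x and the microfluidic reading as y, and the sample values are all assumptions made for illustration. The values are synthetic points placed exactly on the reported line, not the study's measurements, so the sketch returns an R² of 1 rather than 0.9995.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic, hypothetical readings: x plays the role of the reference
# analyzer, y the microfluidic system. They lie exactly on y = 1.09x + 0.07.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
microfluidic = [1.09 * r + 0.07 for r in reference]

slope, intercept, r2 = fit_line(reference, microfluidic)
print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r2:.4f}")
```

A slope near 1 and an intercept near 0 are what "approximately equal" means in this setting. R² alone only measures how tightly the points follow a line, not whether that line is the identity.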

The Influence of Interest, Beliefs, and Identity with Mathematics on Achievement

This study investigated factors that influence mathematics achievement based on a sample of ninth-grade students (N  =  21,444) from the High School Longitudinal Study of 2009 (HSLS09). Key aspects studied included efficacy in mathematics, interest and enjoyment of mathematics, identity with mathematics and future utility beliefs and how these influence mathematics achievement. The predictability of mathematics achievement based on these factors was assessed using correlation coefficients and multiple linear regression. Spearman rank correlations and multiple regression analyses indicated positive and statistically significant relationships between the explanatory variables: mathematics efficacy, identity with mathematics, interest in and future utility beliefs with the response variable, achievement in mathematics.

A Multiple Linear Regression Model to Predict the Price of Cement in Nigeria

This study investigated factors affecting the price of cement in Nigeria, and developed a mathematical model that can predict future cement prices. Cement is key in the Nigerian construction industry. Price changes caused by certain factors could affect economic and infrastructural development; hence there is a need for proactive planning. Secondary data were collected from published information on cement between 2014 and 2019. In addition, questionnaires were sent to some domestic cement retailers in Port Harcourt, Nigeria, to obtain the actual prices of cement over the same period. The study revealed that the most critical factors affecting the price of cement in Nigeria are inflation rate, population growth rate, and Gross Domestic Product (GDP) growth rate. Using data from United Nations, International Monetary Fund, and Central Bank of Nigeria databases, amongst others, a Multiple Linear Regression model was formulated. The model was used to predict the price of cement for 2020-2025. The model was then tested at a 95% confidence level, using a two-tailed t-test and an F-test, resulting in an R² of 0.8428 and an adjusted R² of 0.6069. The results of the tests and the correlation factors confirm the model to be fit and adequate. This study will equip researchers and stakeholders in the construction industry with information for planning, monitoring, and management of present and future construction projects that involve the use of cement.
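The gap between R² and adjusted R² reported above can be understood from the standard adjustment formula, sketched below. The abstract does not state the number of observations, so `n_obs=6` (one observation per year for 2014-2019) and `n_predictors=3` (the three factors named) are assumptions made for illustration only.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Assumed sample size and predictor count (not stated in the abstract).
adj = adjusted_r2(0.8428, n_obs=6, n_predictors=3)
print(f"adjusted R^2 = {adj:.4f}")
```

Under these assumptions the formula gives roughly 0.607, close to the reported adjusted value. With so few residual degrees of freedom, the penalty for each extra predictor is large, which is why the two figures differ so much.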

The Association between Affective States and Sexual/Health-Related Status among Men Who Have Sex with Men in China: An Exploration Study Using Social Media Data

Objectives: The purpose of this study was to understand and examine the association between diurnal mood variation and sexual/health-related status among men who have sex with men (MSM) using data from MSM Chinese Twitter messages. The study consists of 843,745 postings of 377,610 MSM users located in Guangdong that were culled from the MSM Chinese Twitter App. Positive affect, negative affect, sexual related behaviors, and health-related status were measured using the Simplified Chinese Linguistic Inquiry and Word Count. Emotions, including joy, sadness, anger, fear, and disgust were measured using the Weibo Basic Mood Lexicon. A positive sentiment score and a positive emotions score were also calculated. Linear regression models based on a permutation test were used to assess associations between affective states and sexual/health-related status. In the results, 5,871 active MSM users and their 477,374 postings were finally selected. MSM expressed positive affect and joy at 8 a.m. and expressed negative affect and negative emotions between 2 a.m. and 4 a.m. In addition, 25.1% of negative postings were directly related to health and 13.4% reported seeking social support during that sensitive period. MSM who were senior, educated, overweight or obese, self-identified as performing a versatile sex role, and with less followers, more followers, and less chat groups mainly expressed more negative affect and negative emotions. MSM who talked more about sexual-related behaviors had a higher positive sentiment score (β=0.29, p < 0.001) and a higher positive emotions score (β = 0.16, p < 0.001). MSM who reported more on their health status had a lower positive sentiment score (β = -0.83, p < 0.001) and a lower positive emotions score (β = -0.37, p < 0.001). The study concluded that psychological intervention based on an app for MSM should be conducted, as it may improve mental health.

A Data Driven Approach for the Degradation of a Lithium-Ion Battery Based on Accelerated Life Test

Lithium-ion batteries are currently used for many applications including satellites, electric vehicles and mobile electronics. Their ability to store a relatively large amount of energy in a limited space makes them well suited to critical applications. Evaluating the life of these batteries and their reliability is therefore crucial to the systems they support. The reliability of Li-ion batteries has mainly been considered in terms of their lifetime. However, another important factor that can be critical in many applications, such as electric vehicles, is the cycle duration. The present work presents the results of an experimental investigation of the degradation behavior of a laptop Li-ion battery (type TKV2V) and the effect of the applied load on the battery cycle time. The reliability was evaluated using an accelerated life test. Least squares linear regression with median rank estimation was used to estimate the Weibull distribution parameters needed to estimate the reliability functions. The probability density function, failure rate and reliability function under each of the applied loads were evaluated and compared. An inverse power model is introduced that can predict cycle time at any given stress level.
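The estimation step described above, least squares regression with median ranks for a two-parameter Weibull distribution, can be sketched as follows. This is a generic illustration rather than the authors' procedure: it assumes Bernard's approximation for the median ranks, regression of the linearized probability on log time, and complete (uncensored) data. The names `weibull_mrr` and `reliability` and the sample times are hypothetical.

```python
import math


def weibull_mrr(failure_times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression.

    Linearization: ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta).
    """
    t = sorted(failure_times)
    n = len(t)
    xs, ys = [], []
    for i, ti in enumerate(t, start=1):
        f = (i - 0.3) / (n + 0.4)  # Bernard's approximation (an assumption)
        xs.append(math.log(ti))
        ys.append(math.log(-math.log(1.0 - f)))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                        # slope of the fitted line
    eta = math.exp(mean_x - mean_y / beta)  # from intercept = -beta * ln(eta)
    return beta, eta


def reliability(t, beta, eta):
    """Weibull reliability function R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))


# Hypothetical cycle times (arbitrary units), not data from the study.
times = [52.0, 71.0, 88.0, 103.0, 127.0]
beta, eta = weibull_mrr(times)
print(f"beta = {beta:.3f}, eta = {eta:.1f}, R(eta) = {reliability(eta, beta, eta):.3f}")
```

By construction, the reliability at t = eta is exp(-1), about 0.368, whatever the shape parameter. Repeating the fit for each applied load gives one (beta, eta) pair per stress level, which is the kind of input an inverse power model relates to stress.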

Evaluation of the Weight-Based and Fat-Based Indices in Relation to Basal Metabolic Rate-to-Weight Ratio

Basal metabolic rate is questioned as a risk factor for weight gain. The relations between basal metabolic rate and body composition have not yet been clarified. The impact of fat mass on basal metabolic rate is also uncertain. Within this context, indices based upon total body mass as well as total body fat mass are available. In this study, the aim is to investigate the potential clinical utility of these indices in the adult population. A total of 287 individuals, aged from 18 to 79 years, were included in the study. Based upon body mass index values, 10 underweight, 88 normal, 88 overweight, 81 obese, and 20 morbidly obese individuals participated. Anthropometric measurements including height (m) and weight (kg) were performed. Body mass index, diagnostic obesity notation model assessment index I, diagnostic obesity notation model assessment index II, and basal metabolic rate-to-weight ratio were calculated. Total body fat mass (kg), fat percent (%), basal metabolic rate, metabolic age, visceral adiposity, fat mass of the upper and lower extremities and trunk, and obesity degree were measured by a TANITA body composition monitor using bioelectrical impedance analysis technology. Statistical evaluations were performed with the statistical package SPSS for Windows Version 16.0. Scatterplots of individual measurements for the parameters concerning correlations were drawn. Linear regression lines were displayed. Statistical significance was accepted at p < 0.05. Strong correlations were obtained between body mass index and diagnostic obesity notation model assessment index I as well as diagnostic obesity notation model assessment index II (p < 0.001). A much stronger correlation was detected between basal metabolic rate and diagnostic obesity notation model assessment index I than between basal metabolic rate and body mass index (p < 0.001). Upon consideration of the associations between basal metabolic rate-to-weight ratio and these three indices, the best association was observed between basal metabolic rate-to-weight ratio and diagnostic obesity notation model assessment index II. In a similar manner, this index was highly correlated with fat percent (p < 0.001). Independently of the indices, a strong correlation was found between fat percent and basal metabolic rate-to-weight ratio (p < 0.001). Visceral adiposity was much more strongly correlated with metabolic age than with chronological age (p < 0.001). In conclusion, all three indices were associated with metabolic age, but not with chronological age. Diagnostic obesity notation model assessment index II values were highly correlated with body mass index values throughout all ranges, from underweight to morbid obesity. This index is the best in terms of its association with basal metabolic rate-to-weight ratio, which can be interpreted as basal metabolic rate unit.

Blood Glucose Level Measurement from Breath Analysis

The constant monitoring of blood glucose level is necessary for maintaining health of patients and to alert medical specialists to take preemptive measures before the onset of any complication as a result of diabetes. The current clinical monitoring of blood glucose uses invasive methods repeatedly which are uncomfortable and may result in infections in diabetic patients. Several attempts have been made to develop non-invasive techniques for blood glucose measurement. In this regard, the existing methods are not reliable and are less accurate. Other approaches claiming high accuracy have not been tested on extended dataset, and thus, results are not statistically significant. It is a well-known fact that acetone concentration in breath has a direct relation with blood glucose level. In this paper, we have developed the first of its kind, reliable and high accuracy breath analyzer for non-invasive blood glucose measurement. The acetone concentration in breath was measured using MQ 138 sensor in the samples collected from local hospitals in Pakistan involving one hundred patients. The blood glucose levels of these patients are determined using conventional invasive clinical method. We propose a linear regression classifier that is trained to map breath acetone level to the collected blood glucose level achieving high accuracy.

Integrated Mass Rapid Transit System for Smart City Project in Western India

This paper is an attempt to develop an Integrated Mass Rapid Transit System (MRTS) for a smart city project in Western India. Integrated transportation is one of the enablers of smart transportation for providing a seamless intercity as well as regional level transportation experience. The success of a smart city project at the city level for transportation is providing proper integration to different mass rapid transit modes by way of integrating information, physical, network of routes fares, etc. The methodology adopted for this study was primary data research through questionnaire survey. The respondents of the questionnaire survey have responded on the issues about their perceptions on the ways and means to improve public transport services in urban cities. The respondents were also required to identify the factors and attributes which might motivate more people to shift towards the public mode. Also, the respondents were questioned about the factors which they feel might restrain the integration of various modes of MRTS. Furthermore, this study also focuses on developing a utility equation for respondents with the help of multiple linear regression analysis and its probability to shift to public transport for certain factors listed in the questionnaire. It has been observed that for shifting to public transport, the most important factors that need to be considered were travel time saving and comfort rating. Also, an Integrated MRTS can be obtained by combining metro rail with BRTS, metro rail with monorail, monorail with BRTS and metro rail with Indian railways. Providing a common smart card to transport users for accessing all the different available modes would be a pragmatic solution towards integration of the available modes of MRTS.

Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison

Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find an efficient way to predict corn yield from meteorological records. The prediction models used in this paper can be classified into model-driven and data-driven approaches, according to their modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with the environment as dynamical systems. However, calibrating the dynamical system is difficult because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of dataset (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. The data-driven approach for yield prediction, on the other hand, is free of the complex biophysical process, but it places some strict requirements on the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods: methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-fold cross-validation process and two accuracy metrics, root mean square error of prediction (RMSEP) and mean absolute error of prediction (MAEP), were used to evaluate the crop prediction capacity. The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method for calibrating the mechanistic model from easily accessible datasets offers several side-perspectives. The mechanistic model can potentially help to highlight the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.
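The evaluation protocol mentioned above, k-fold cross-validation scored with a root mean square error and a mean absolute error of prediction, can be sketched in a few lines. This is only an assumed reading of the protocol: folds are contiguous and unshuffled, errors are pooled over all held-out records, and both metrics are in the units of the target, since the abstract does not say how its percentage figures are normalized. The mean-value baseline and the data are hypothetical placeholders for a real model and the USDA records.

```python
import math


def rmsep(actual, predicted):
    """Root mean square error of prediction."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def maep(actual, predicted):
    """Mean absolute error of prediction."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for fold in range(k):
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, fit, predict, k=5):
    """Train on k-1 folds, predict the held-out fold, pool errors over folds."""
    actual, predicted = [], []
    for test_idx in k_fold_indices(len(ys), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        for i in test_idx:
            actual.append(ys[i])
            predicted.append(predict(model, xs[i]))
    return rmsep(actual, predicted), maep(actual, predicted)


# Hypothetical baseline model: always predict the training mean.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


xs = list(range(10))
ys = [float(v) for v in range(10)]  # made-up yields, for illustration only
rmse_cv, mae_cv = cross_validate(xs, ys, fit_mean, predict_mean, k=5)
print(f"RMSEP = {rmse_cv:.3f}, MAEP = {mae_cv:.3f}")
```

Any regressor can be compared under the same folds by passing a different `fit`/`predict` pair, which is what makes the error figures of different methods comparable.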

Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression

Crank shaft length, connecting rod length, crank angle, engine rpm, cylinder bore, mass of piston and compression ratio are the inputs that can control the performance of the slider crank mechanism and then its efficiency. Several combinations of these seven inputs are used and compared. The throughput engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve system of algebraic equations. These models are validated. A regression model in seven inputs including their interaction terms lowered the polynomial degree from 3rd degree to 1st degree and suggested valid predictions and stable explanations.
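To make the solution step concrete, the sketch below fits a multi-linear regression by forming the normal equations and solving them with an LU decomposition. It is a generic illustration, not the authors' implementation: it uses a Doolittle factorization without pivoting (adequate when the normal matrix is positive definite), two hypothetical inputs with one interaction term instead of the seven engine inputs, and made-up noise-free data. All function names are hypothetical.

```python
def lu_decompose(a):
    """Doolittle LU factorization (no pivoting): a = lower * upper."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        lower[i][i] = 1.0
        for j in range(i, n):       # row i of upper
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        for j in range(i + 1, n):   # column i of lower
            lower[j][i] = (a[j][i] - sum(lower[j][k] * upper[k][i] for k in range(i))) / upper[i][i]
    return lower, upper


def lu_solve(lower, upper, b):
    """Solve lower * upper * x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(lower[i][k] * y[k] for k in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(upper[i][k] * x[k] for k in range(i + 1, n))) / upper[i][i]
    return x


def fit_mlr(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    lower, upper = lu_decompose(xtx)
    return lu_solve(lower, upper, xty)


# Hypothetical inputs x1, x2 on a 3x3 grid, with the interaction term x1 * x2.
grid = [(float(a), float(b)) for a in range(3) for b in range(3)]
features = [(x1, x2, x1 * x2) for x1, x2 in grid]
response = [2.0 + 3.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2 for x1, x2 in grid]

coeffs = fit_mlr(features, response)
print([round(c, 6) for c in coeffs])
```

Including the product x1 * x2 as its own column keeps the model linear in its coefficients, so the same linear solve handles the variants with and without interaction terms.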

Analysis on Precipitation Variation Patterns of Chenzhou City

By using linear regression methodology to analyze the data of daily precipitation from 1961-2012, this paper studied the variation tendency of precipitation in Chenzhou. The outcome showed: (1) The annual precipitation was decreasing for 52 years and the difference of precipitation variation tendency among four seasons was remarkable. The precipitation of spring and autumn showed more remarkable decrease than of summer; but the precipitation of winter significantly increased. (2) The annual precipitation frequency tended to lower, which was consistent with the tendency of yearly variation. The seasonal precipitation frequency was greatly different, namely, precipitation frequency in spring and autumn decreased, co-occurring with the phenomenon of mutation; but the winter precipitation frequency increased notably. (3) The precipitation intensity displayed a tendency of increase, including spring, autumn and winter; among them, winter had the most obvious tendency to increase, and autumn had the most yearly variation. Summer was the only season with a tendency of decreasing in precipitation intensity. (4) Annual extreme precipitation tended to reduce, spring, summer and autumn are all included; whereas, winter extreme precipitation tended to increase at the rate of 0.1d/10a. (5) The daily maximum precipitation intensity increased slightly and it varied greatly.

Analysis on the Feasibility of Landsat 8 Imagery for Water Quality Parameters Assessment in an Oligotrophic Mediterranean Lake

Lake water quality monitoring in combination with the use of earth observation products constitutes a major component in many water quality monitoring programs. Landsat 8 images of Trichonis Lake (Greece) acquired on 30/10/2013 and 30/08/2014 were used to explore the potential of Landsat 8 to estimate water quality parameters, particularly CDOM absorption at specific wavelengths, chlorophyll-a and nutrient concentrations, in this oligotrophic freshwater body, which is characterized by nonexistent quantitative, temporal and spatial variability. Water samples were collected at 22 different stations in late August 2014, and the satellite image of the same date was used to statistically correlate the in-situ measurements with various combinations of Landsat 8 bands in order to develop algorithms that best describe those relationships and accurately calculate the aforementioned water quality components. Optimal models were applied to the image of late October 2013, and the results were validated by comparison with the respective available in-situ data of 2013. Initial results indicated the limited ability of the Landsat 8 sensor to accurately estimate water quality components in an oligotrophic waterbody. The validation process showed that ammonium concentrations were the most accurately estimated component (R = 0.7), followed by chl-a concentration (R = 0.5) and CDOM absorption at 420 nm (R = 0.3). In-situ nitrate, nitrite, phosphate and total nitrogen concentrations of 2014 were below the detection limit of the instrument used, hence no statistical analysis was conducted for them. On the other hand, multiple linear regression between reflectance measures and total phosphorus concentrations resulted in low and statistically insignificant correlations. Our results were consistent with other studies in the international literature, indicating that estimations for eutrophic and mesotrophic lakes are more accurate than for oligotrophic ones, owing to the lack of suspended particles that are detectable by satellite sensors. Nevertheless, although the predictive models developed and applied to the oligotrophic Trichonis Lake are less accurate, they may still be useful indicators of its water quality deterioration.

Spatial Distribution of Local Sheep Breeds in Antalya Province

Sheep breeding is important in Turkey for meeting the demand for red meat, supplying industrial raw materials, and providing employment in the rural sector. It is also very important to ensure the selection and continuity of the breeds that are raised in order to increase the quality and productivity of sheep breeding products. The protection of local breeds and crossbreds also enables the development of the sector in the region and the reduction of imports. In this study, the data were obtained from the records of the Turkish Statistical Institute and the Antalya Sheep & Goat Breeders' Association. The spatial distribution of sheep breeds in Antalya is reviewed statistically in terms of concentration at the local level for 2015. For this purpose, mapping, box plots and linear regression are used in this study. Concentration is presented by mapping studbook data on sheep breeding for local breeds and total sheep farms. It is observed that the Pırlak breed (17.5%) and the Merinos crossbreed (16.3%) have the highest concentration in the region. These breeds are followed, in order, by the Akkaraman breed (11%), Pırlak crossbreed (8%), Merinos breed (7.9%), Akkaraman crossbreed (7.9%) and Ivesi breed (7.2%).

Deterioration Assessment Models for Water Pipelines

The aging and deterioration of water pipelines in cities worldwide result in more frequent water main breaks, water service disruptions, and flooding damage. Therefore, there is an urgent need for proper maintenance procedures to avoid breaks and disastrous failures. However, due to budget limitations, the maintenance of water pipeline networks needs to be prioritized through efficient deterioration assessment models. Previous studies focused on the development of structural or physical deterioration assessment models, which require expensive inspection data. In contrast, this paper aims to develop deterioration assessment models for water pipelines using statistical techniques. Several deterioration models were developed based on pipeline size, material type, and soil type using linear regression analysis. The categorical nature of some variables affecting pipeline deterioration was accounted for by developing several categorical models. The developed models were validated with an average validity percentage greater than 95%. Moreover, sensitivity analysis was carried out against different classifications, and it showed that pipe age is more important than the other factors. The developed models will help water municipalities and asset managers to assess the condition of their pipes and prioritize them for maintenance and inspection.