A Character Detection Method for Ancient Yi Books Based on Connected Components and Regressive Character Segmentation

Character detection is an important issue for character recognition of ancient Yi books. The accuracy of detection directly affects the recognition effect of ancient Yi books. Considering the complex layout, the lack of standard typesetting and the mixed arrangement between images and texts, we propose a character detection method for ancient Yi books based on connected components and regressive character segmentation. First, the scanned images of ancient Yi books are preprocessed with nonlocal mean filtering, and then a modified local adaptive threshold binarization algorithm is used to obtain the binary images to segment the foreground and background for the images. Second, the non-text areas are removed by the method based on connected components. Finally, the single character in the ancient Yi books is segmented by our method. The experimental results show that the method can effectively separate the text areas and non-text areas for ancient Yi books and achieve higher accuracy and recall rate in the experiment of character detection, and effectively solve the problem of character detection and segmentation in character recognition of ancient books.

The Influence of Interest, Beliefs, and Identity with Mathematics on Achievement

This study investigated factors that influence mathematics achievement based on a sample of ninth-grade students (N  =  21,444) from the High School Longitudinal Study of 2009 (HSLS09). Key aspects studied included efficacy in mathematics, interest and enjoyment of mathematics, identity with mathematics and future utility beliefs and how these influence mathematics achievement. The predictability of mathematics achievement based on these factors was assessed using correlation coefficients and multiple linear regression. Spearman rank correlations and multiple regression analyses indicated positive and statistically significant relationships between the explanatory variables: mathematics efficacy, identity with mathematics, interest in and future utility beliefs with the response variable, achievement in mathematics.

Monitoring Blood Pressure Using Regression Techniques

Blood pressure helps the physicians greatly to have a deep insight into the cardiovascular system. The determination of individual blood pressure is a standard clinical procedure considered for cardiovascular system problems. The conventional techniques to measure blood pressure (e.g. cuff method) allows a limited number of readings for a certain period (e.g. every 5-10 minutes). Additionally, these systems cause turbulence to blood flow; impeding continuous blood pressure monitoring, especially in emergency cases or critically ill persons. In this paper, the most important statistical features in the photoplethysmogram (PPG) signals were extracted to estimate the blood pressure noninvasively. PPG signals from more than 40 subjects were measured and analyzed and 12 features were extracted. The features were fed to principal component analysis (PCA) to find the most important independent features that have the highest correlation with blood pressure. The results show that the stiffness index means and standard deviation for the beat-to-beat heart rate were the most important features. A model representing both features for Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP) was obtained using a statistical regression technique. Surface fitting is used to best fit the series of data and the results show that the error value in estimating the SBP is 4.95% and in estimating the DBP is 3.99%.

A Multiple Linear Regression Model to Predict the Price of Cement in Nigeria

This study investigated factors affecting the price of cement in Nigeria, and developed a mathematical model that can predict future cement prices. Cement is key in the Nigerian construction industry. The changes in price caused by certain factors could affect economic and infrastructural development; hence there is need for proper proactive planning. Secondary data were collected from published information on cement between 2014 and 2019. In addition, questionnaires were sent to some domestic cement retailers in Port Harcourt in Nigeria, to obtain the actual prices of cement between the same periods. The study revealed that the most critical factors affecting the price of cement in Nigeria are inflation rate, population growth rate, and Gross Domestic Product (GDP) growth rate. With the use of data from United Nations, International Monetary Fund, and Central Bank of Nigeria databases, amongst others, a Multiple Linear Regression model was formulated. The model was used to predict the price of cement for 2020-2025. The model was then tested with 95% confidence level, using a two-tailed t-test and an F-test, resulting in an R2 of 0.8428 and R2 (adj.) of 0.6069. The results of the tests and the correlation factors confirm the model to be fit and adequate. This study will equip researchers and stakeholders in the construction industry with information for planning, monitoring, and management of present and future construction projects that involve the use of cement.

The Non-Stationary BINARMA(1,1) Process with Poisson Innovations: An Application on Accident Data

This paper considers the modelling of a non-stationary bivariate integer-valued autoregressive moving average of order one (BINARMA(1,1)) with correlated Poisson innovations. The BINARMA(1,1) model is specified using the binomial thinning operator and by assuming that the cross-correlation between the two series is induced by the innovation terms only. Based on these assumptions, the non-stationary marginal and joint moments of the BINARMA(1,1) are derived iteratively by using some initial stationary moments. As regards to the estimation of parameters of the proposed model, the conditional maximum likelihood (CML) estimation method is derived based on thinning and convolution properties. The forecasting equations of the BINARMA(1,1) model are also derived. A simulation study is also proposed where BINARMA(1,1) count data are generated using a multivariate Poisson R code for the innovation terms. The performance of the BINARMA(1,1) model is then assessed through a simulation experiment and the mean estimates of the model parameters obtained are all efficient, based on their standard errors. The proposed model is then used to analyse a real-life accident data on the motorway in Mauritius, based on some covariates: policemen, daily patrol, speed cameras, traffic lights and roundabouts. The BINARMA(1,1) model is applied on the accident data and the CML estimates clearly indicate a significant impact of the covariates on the number of accidents on the motorway in Mauritius. The forecasting equations also provide reliable one-step ahead forecasts.

An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

A key issue in stock investment is how to select representative features for stock selection. The objective of this paper is to firstly determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as, sensitivity (recall), specificity and overall accuracy for all models. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month the return for each stock portfolio was computed and compared with the stock market index returns. The returns for all three stock portfolios was 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio while the stock market performance was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.

Development of a Real Time Axial Force Measurement System and IoT-Based Monitoring for Smart Bearing

The purpose of this research is to develop a real time axial force measurement system for a smart bearing through the use of strain-gauges, whereby the data acquisition is performed by an Arduino microcontroller due to its easy manipulation and low-cost. The measured signal is acquired and then discretized using a Wheatstone Bridge and an Analog-Digital Converter (ADC) respectively. For bearing monitoring, a real time monitoring system based on Internet of things (IoT) and Bluetooth were developed. Experimental tests were performed on a bearing within a force range up to 600 kN. The experimental results show that there is a proportional linear relationship between the applied force and the output voltage, and the error R squared is within 0.9878 based on the regression analysis.

Bio-Psycho-Social Consequences and Effects in Fall-Efficacy Scale in Seniors Using Exercise Intervention of Motor Learning According to Yoga Techniques

The paper declares effects of exercise intervention of the research project “Basic research of balance changes in seniors”, granted by the Czech Science Foundation. The objective of the presented study is to define predictors, which influence bio-psycho-social consequences and effects of balance ability in senior 65 years old and above. We focused on the Fall-Efficacy Scale changes evaluation in seniors. Comprehensive hypothesis of the project declares, that motion uncertainty (dyskinesia) can negatively affect the well-being of a senior in bio-psycho-social context. In total, random selection and testing of 100 seniors (30 males, 70 females) from Prague and Central Bohemian region was provided. The sample was divided by stratified random selection into experimental and control groups, who underwent input and output testing. For diagnostics the methods of Medical Anamnesis, Functional anthropological examinations, Tinetti Balance Assessment Tool, SF-36 Health Survey, Anamnestic comparative self-assessment scale were used. Intervention method called "Life in Balance" based on yoga techniques was applied in four-week cycle. Results of multivariate regression were verified by repeated measures ANOVA: subject factor, phase of intervention (between-subject factor), body fluid (within-subject factor) and phase of intervention × body fluid interaction). ANOVA was performed with a repetition involving the factors of subjects, experimental/control group, phase of intervention (independent variable), and x phase interaction followed by Bonferroni multiple comparison assays with a test strength of at least 0.8 on the probability level p < 0.05. In the paper results of the first-year investigation of the three years running project are analysed. Results of balance tests confirmed no significant difference between females and males in pre-test. Significant improvements in balance and walking ability were observed in experimental group in females comparing to males (F = 128.4, p < 0.001). In the females control group, there was no significant change in post- test, while in the female experimental group positive changes in posture and spine flexibility in post-tests were found. It seems that females even in senior age react better to incentives of intervention in balance and spine flexibility. On the base of results analyses, we can declare the significant improvement in social balance markers after intervention in the experimental group (F = 10.5, p < 0.001). In average, seniors are used to take four drugs daily. Number of drugs can contribute to allergy symptoms and balance problems. It can be concluded that static balance and walking ability of seniors according Tinetti Balance scale correlate significantly with psychic and social monitored markers.

Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients

This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of a blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including: Adaptative Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy for training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.

Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.

Monomial Form Approach to Rectangular Surface Modeling

Geometric modeling plays an important role in the constructions and manufacturing of curve, surface and solid modeling. Their algorithms are critically important not only in the automobile, ship and aircraft manufacturing business, but are also absolutely necessary in a wide variety of modern applications, e.g., robotics, optimization, computer vision, data analytics and visualization. The calculation and display of geometric objects can be accomplished by these six techniques: Polynomial basis, Recursive, Iterative, Coefficient matrix, Polar form approach and Pyramidal algorithms. In this research, the coefficient matrix (simply called monomial form approach) will be used to model polynomial rectangular patches, i.e., Said-Ball, Wang-Ball, DP, Dejdumrong and NB1 surfaces. Some examples of the monomial forms for these surface modeling are illustrated in many aspects, e.g., construction, derivatives, model transformation, degree elevation and degress reduction.

Clinical Utility of Salivary Cytokines for Children with Attention Deficit Hyperactivity Disorder

The goal of this study was to examine the possibility of salivary cytokines for the screening of attention deficit hyperactivity disorder (ADHD) in children. We carried out a case-control study, including 19 children with ADHD and 17 healthy children (controls). A multiplex bead array immunoassay was used to conduct a multi-analysis of 27 different salivary cytokines. Six salivary cytokines (interleukin (IL)-1β, IL-8, IL12p70, granulocyte colony-stimulating factor (G-CSF), interferon gamma (IFN-γ), and vascular endothelial growth factor (VEGF)) were significantly associated with the presence of ADHD (p < 0.05). An informative salivary cytokine panel was developed using VEGF by logistic regression analysis (odds ratio: 0.251). Receiver operating characteristic analysis revealed that assessment of a panel using VEGF showed “good” capability for discriminating between ADHD patients and controls (area under the curve: 0.778). ADHD has been hypothesized to be associated with reduced cerebral blood flow in the frontal cortex, due to reduced VEGF levels. Our study highlights the possibility of utilizing differential salivary cytokine levels for point-of-care testing (POCT) of biomarkers in children with ADHD.

Effects of Polyvictimization in Suicidal Ideation among Children and Adolescents in Chile

In Chile, there is a lack of evidence about the impact of polyvictimization on the emergence of suicidal thoughts among children and young people. Thus, this study aims to explore the association between the episodes of polyvictimization suffered by Chilean children and young people and the manifestation of signs related to suicidal tendencies. To achieve this purpose, secondary data from the First Polyvictimization Survey on Children and Adolescents of 2017 were analyzed, and a binomial logistic regression model was applied to establish the probability that young people are experiencing suicidal ideation episodes. The main findings show that women between the ages of 13 and 15 years, who are in seventh grade and second in subsidized schools, are more likely to express suicidal ideas, which increases if they have suffered different types of victimization, particularly physical violence, psychological aggression, and sexual abuse.

Architecture Performance-Related Design Based on Graphic Parameterization

Architecture plane form is an important consideration in the design of green buildings due to its significant impact on energy performance. The most effective method to consider energy performance in the early design stages is parametric modelling. This paper presents a methodology to program plane forms using MATLAB language, generating 16 kinds of plane forms by changing four designed parameters. DesignBuilder (an energy consumption simulation software) was proposed to simulate the energy consumption of the generated planes. A regression mathematical model was established to study the relationship between the plane forms and their energy consumption. The main finding of the study suggested that there was a cubic function relationship between the depth-ratio of U-shaped buildings and energy consumption, and there is also a cubic function relationship between the width-ratio and energy consumption. In the design, the depth-ratio of U-shaped buildings should not be less than 2.5, and the width-ratio should not be less than 2.

The Association between Affective States and Sexual/Health-Related Status among Men Who Have Sex with Men in China: An Exploration Study Using Social Media Data

Objectives: The purpose of this study was to understand and examine the association between diurnal mood variation and sexual/health-related status among men who have sex with men (MSM) using data from MSM Chinese Twitter messages. The study consists of 843,745 postings of 377,610 MSM users located in Guangdong that were culled from the MSM Chinese Twitter App. Positive affect, negative affect, sexual related behaviors, and health-related status were measured using the Simplified Chinese Linguistic Inquiry and Word Count. Emotions, including joy, sadness, anger, fear, and disgust were measured using the Weibo Basic Mood Lexicon. A positive sentiment score and a positive emotions score were also calculated. Linear regression models based on a permutation test were used to assess associations between affective states and sexual/health-related status. In the results, 5,871 active MSM users and their 477,374 postings were finally selected. MSM expressed positive affect and joy at 8 a.m. and expressed negative affect and negative emotions between 2 a.m. and 4 a.m. In addition, 25.1% of negative postings were directly related to health and 13.4% reported seeking social support during that sensitive period. MSM who were senior, educated, overweight or obese, self-identified as performing a versatile sex role, and with less followers, more followers, and less chat groups mainly expressed more negative affect and negative emotions. MSM who talked more about sexual-related behaviors had a higher positive sentiment score (β=0.29, p < 0.001) and a higher positive emotions score (β = 0.16, p < 0.001). MSM who reported more on their health status had a lower positive sentiment score (β = -0.83, p < 0.001) and a lower positive emotions score (β = -0.37, p < 0.001). The study concluded that psychological intervention based on an app for MSM should be conducted, as it may improve mental health.

A Data Driven Approach for the Degradation of a Lithium-Ion Battery Based on Accelerated Life Test

Lithium ion batteries are currently used for many applications including satellites, electric vehicles and mobile electronics. Their ability to store relatively large amount of energy in a limited space make them most appropriate for critical applications. Evaluation of the life of these batteries and their reliability becomes crucial to the systems they support. Reliability of Li-Ion batteries has been mainly considered based on its lifetime. However, another important factor that can be considered critical in many applications such as in electric vehicles is the cycle duration. The present work presents the results of an experimental investigation on the degradation behavior of a Laptop Li-ion battery (type TKV2V) and the effect of applied load on the battery cycle time. The reliability was evaluated using an accelerated life test. Least squares linear regression with median rank estimation was used to estimate the Weibull distribution parameters needed for the reliability functions estimation. The probability density function, failure rate and reliability function under each of the applied loads were evaluated and compared. An inverse power model is introduced that can predict cycle time at any stress level given.

Monte Carlo Estimation of Heteroscedasticity and Periodicity Effects in a Panel Data Regression Model

This research attempts to investigate the effects of heteroscedasticity and periodicity in a Panel Data Regression Model (PDRM) by extending previous works on balanced panel data estimation within the context of fitting PDRM for Banks audit fee. The estimation of such model was achieved through the derivation of Joint Lagrange Multiplier (LM) test for homoscedasticity and zero-serial correlation, a conditional LM test for zero serial correlation given heteroscedasticity of varying degrees as well as conditional LM test for homoscedasticity given first order positive serial correlation via a two-way error component model. Monte Carlo simulations were carried out for 81 different variations, of which its design assumed a uniform distribution under a linear heteroscedasticity function. Each of the variation was iterated 1000 times and the assessment of the three estimators considered are based on Variance, Absolute bias (ABIAS), Mean square error (MSE) and the Root Mean Square (RMSE) of parameters estimates. Eighteen different models at different specified conditions were fitted, and the best-fitted model is that of within estimator when heteroscedasticity is severe at either zero or positive serial correlation value. LM test results showed that the tests have good size and power as all the three tests are significant at 5% for the specified linear form of heteroscedasticity function which established the facts that Banks operations are severely heteroscedastic in nature with little or no periodicity effects.

Scope, Relevance and Sustainability of Decentralized Renewable Energy Systems in Developing Economies: Imperatives from Indian Case Studies

‘Energy for all’, is a global issue of concern for the past many years. Despite the number of technological advancements and innovations, significant numbers of people are living without access to electricity around the world. India, an emerging economy, tops the list of nations having the maximum number of residents living off the grid, thus raising global attention in past few years to provide clean and sustainable energy access solutions to all of its residents. It is evident from developed economies that centralized planning and electrification alone is not sufficient for meeting energy security. Implementation of off-grid and consumer-driven energy models like Decentralized Renewable Energy (DRE) systems have played a significant role in meeting the national energy demand in developed nations. Cases of DRE systems have been reported in developing countries like India for the past few years. This paper attempts to profile the status of DRE projects in the Indian context with their scope and relevance to ensure universal electrification. Diversified cases of DRE projects, particularly solar, biomass and micro hydro are identified in different Indian states. Critical factors affecting the sustainability of DRE projects are extracted with their interlinkages in the context of developers, beneficiaries and promoters involved in such projects. Socio-techno-economic indicators are identified through similar cases in the context of DRE projects. Exploratory factor analysis is performed to evaluate the critical sustainability factors followed by regression analysis to establish the relationship between the dependent and independent factors. The generated EFA-Regression model provides a basis to develop the sustainability and replicability framework for broader coverage of DRE projects in developing nations in order to attain the goal of universal electrification with least carbon emissions.

A 1H NMR-Linked PCR Modelling Strategy for Tracking the Fatty Acid Sources of Aldehydic Lipid Oxidation Products in Culinary Oils Exposed to Simulated Shallow-Frying Episodes

Objectives/Hypotheses: The adverse health effect potential of dietary lipid oxidation products (LOPs) has evoked much clinical interest. Therefore, we employed a 1H NMR-linked Principal Component Regression (PCR) chemometrics modelling strategy to explore relationships between data matrices comprising (1) aldehydic LOP concentrations generated in culinary oils/fats when exposed to laboratory-simulated shallow frying practices, and (2) the prior saturated (SFA), monounsaturated (MUFA) and polyunsaturated fatty acid (PUFA) contents of such frying media (FM), together with their heating time-points at a standard frying temperature (180 oC). Methods: Corn, sunflower, extra virgin olive, rapeseed, linseed, canola, coconut and MUFA-rich algae frying oils, together with butter and lard, were heated according to laboratory-simulated shallow-frying episodes at 180 oC, and FM samples were collected at time-points of 0, 5, 10, 20, 30, 60, and 90 min. (n = 6 replicates per sample). Aldehydes were determined by 1H NMR analysis (Bruker AV 400 MHz spectrometer). The first (dependent output variable) PCR data matrix comprised aldehyde concentration scores vectors (PC1* and PC2*), whilst the second (predictor) one incorporated those from the fatty acid content/heating time variables (PC1-PC4) and their first-order interactions. Results: Structurally complex trans,trans- and cis,trans-alka-2,4-dienals, 4,5-epxy-trans-2-alkenals and 4-hydroxy-/4-hydroperoxy-trans-2-alkenals (group I aldehydes predominantly arising from PUFA peroxidation) strongly and positively loaded on PC1*, whereas n-alkanals and trans-2-alkenals (group II aldehydes derived from both MUFA and PUFA hydroperoxides) strongly and positively loaded on PC2*. PCR analysis of these scores vectors (SVs) demonstrated that PCs 1 (positively-loaded linoleoylglycerols and [linoleoylglycerol]:[SFA] content ratio), 2 (positively-loaded oleoylglycerols and negatively-loaded SFAs), 3 (positively-loaded linolenoylglycerols and [PUFA]:[SFA] content ratios), and 4 (exclusively orthogonal sampling time-points) all powerfully contributed to aldehydic PC1* SVs (p 10-3 to < 10-9), as did all PC1-3 x PC4 interaction ones (p 10-5 to < 10-9). PC2* was also markedly dependent on all the above PC SVs (PC2 > PC1 and PC3), and the interactions of PC1 and PC2 with PC4 (p < 10-9 in each case), but not the PC3 x PC4 contribution. Conclusions: NMR-linked PCR analysis is a valuable strategy for (1) modelling the generation of aldehydic LOPs in heated cooking oils and other FM, and (2) tracking their unsaturated fatty acid (UFA) triacylglycerol sources therein.

dynr.mi: An R Program for Multiple Imputation in Dynamic Modeling

Assessing several individuals intensively over time yields intensive longitudinal data (ILD). Even though ILD provide rich information, they also bring other data analytic challenges. One of these is the increased occurrence of missingness with increased study length, possibly under non-ignorable missingness scenarios. Multiple imputation (MI) handles missing data by creating several imputed data sets, and pooling the estimation results across imputed data sets to yield final estimates for inferential purposes. In this article, we introduce dynr.mi(), a function in the R package, Dynamic Modeling in R (dynr). The package dynr provides a suite of fast and accessible functions for estimating and visualizing the results from fitting linear and nonlinear dynamic systems models in discrete as well as continuous time. By integrating the estimation functions in dynr and the MI procedures available from the R package, Multivariate Imputation by Chained Equations (MICE), the dynr.mi() routine is designed to handle possibly non-ignorable missingness in the dependent variables and/or covariates in a user-specified dynamic systems model via MI, with convergence diagnostic check. We utilized dynr.mi() to examine, in the context of a vector autoregressive model, the relationships among individuals’ ambulatory physiological measures, and self-report affect valence and arousal. The results from MI were compared to those from listwise deletion of entries with missingness in the covariates. When we determined the number of iterations based on the convergence diagnostics available from dynr.mi(), differences in the statistical significance of the covariate parameters were observed between the listwise deletion and MI approaches. These results underscore the importance of considering diagnostic information in the implementation of MI procedures.