Using Statistical Significance and Prediction to Test Long/Short Term Public Services and Patients Cohorts: A Case Study in Scotland

Health and Social care (HSc) services planning and scheduling are facing unprecedented challenges, due to the pandemic pressure and also suffer from unplanned spending that is negatively impacted by the global financial crisis. Data-driven approaches can help to improve policies, plan and design services provision schedules using algorithms that assist healthcare managers to face unexpected demands using fewer resources. The paper discusses services packing using statistical significance tests and machine learning (ML) to evaluate demands similarity and coupling. This is achieved by predicting the range of the demand (class) using ML methods such as Classification and Regression Trees (CART), Random Forests (RF), and Logistic Regression (LGR). The significance tests Chi-Squared and Student’s test are used on data over a 39 years span for which data exist for services delivered in Scotland. The demands are associated using probabilities and are parts of statistical hypotheses. These hypotheses, as their NULL part, assume that the target demand is statistically dependent on other services’ demands. This linking is checked using the data. In addition, ML methods are used to linearly predict the above target demands from the statistically found associations and extend the linear dependence of the target’s demand to independent demands forming, thus, groups of services. Statistical tests confirmed ML coupling and made the prediction statistically meaningful and proved that a target service can be matched reliably to other services while ML showed that such marked relationships can also be linear ones. Zero padding was used for missing years records and illustrated better such relationships both for limited years and for the entire span offering long-term data visualizations while limited years periods explained how well patients numbers can be related in short periods of time or that they can change over time as opposed to behaviours across more years. The prediction performance of the associations were measured using metrics such as Receiver Operating Characteristic (ROC), Area Under Curve (AUC) and Accuracy (ACC) as well as the statistical tests Chi-Squared and Student. Co-plots and comparison tables for the RF, CART, and LGR methods as well as the p-value from tests and Information Exchange (IE/MIE) measures are provided showing the relative performance of ML methods and of the statistical tests as well as the behaviour using different learning ratios. The impact of k-neighbours classification (k-NN), Cross-Correlation (CC) and C-Means (CM) first groupings was also studied over limited years and for the entire span. It was found that CART was generally behind RF and LGR but in some interesting cases, LGR reached an AUC = 0 falling below CART, while the ACC was as high as 0.912 showing that ML methods can be confused by zero-padding or by data’s irregularities or by the outliers. On average, 3 linear predictors were sufficient, LGR was found competing well RF and CART followed with the same performance at higher learning ratios. Services were packed only when a significance level (p-value) of their association coefficient was more than 0.05. Social factors relationships were observed between home care services and treatment of old people, low birth weights, alcoholism, drug abuse, and emergency admissions. The work found  that different HSc services can be well packed as plans of limited duration, across various services sectors, learning configurations, as confirmed by using statistical hypotheses.

Obesity and Bone Mineral Density in Patients with Large Joint Osteoarthritis

Along with the global aging of population, the number of people with somatic diseases is increasing, including such interrelated pathologies as obesity, osteoarthritis (OA) and osteoporosis (OP). The objective of the study is to examine the connection between body mass index (BMI), OA and bone mineral density (BMD) of lumbar spine, femoral neck and trabecular bone score (TBS) in postmenopausal women with OA. We have observed 359 postmenopausal women (50-89 years old) and divided them into four groups by age: 50-59 yrs, 60-69 yrs, 70-79 yrs and over 80 years old. In addition, according to the American College of Rheumatology (ACR) Clinical classification criteria for knee and hip OA, we divided them into 2 groups: group I – 117 females with symptomatic OA (including 89 patients with knee OA, 28 patients with hip OA) and group II –242 women with a normal functional activity of large joints. Analysis of data was performed taking into account their BMI, classified by World Health Organization (WHO). Diagnosis of obesity was established when BMI was above 30 kg/m2. In woman with obesity, a symptomatic OA was detected in 44 postmenopausal women (41.1%), a normal functional activity of large joints - in 63 women (58.9%). However, in women with normal BMI – 73 women, who account for 29.0% of cases, a symptomatic OA was detected. According to a chi-squared (χ2) test, a significantly higher level of BMI was detected in postmenopausal women with OA (χ2 = 5.05, p = 0.02). Women with a symptomatic OA had a significantly higher BMD of lumbar spine compared with women who had a normal functional activity of large joints. No significant differences of BMD of femoral necks or TBS were detected in either the group with OA or with a normal functional activity of large joints.

Evaluating the Understanding of the University Students (Basic Sciences and Engineering) about the Numerical Representation of the Average Rate of Change

The present study aimed to evaluate the understanding of the students in Tehran universities (Iran) about the numerical representation of the average rate of change based on the Structure of Observed Learning Outcomes (SOLO). In the present descriptive-survey research, the statistical population included undergraduate students (basic sciences and engineering) in the universities of Tehran. The samples were 604 students selected by random multi-stage clustering. The measurement tool was a task whose face and content validity was confirmed by math and mathematics education professors. Using Cronbach's Alpha criterion, the reliability coefficient of the task was obtained 0.95, which verified its reliability. The collected data were analyzed by descriptive statistics and inferential statistics (chi-squared and independent t-tests) under SPSS-24 software. According to the SOLO model in the prestructural, unistructural, and multistructural levels, basic science students had a higher percentage of understanding than that of engineering students, although the outcome was inverse at the relational level. However, there was no significant difference in the average understanding of both groups. The results indicated that students failed to have a proper understanding of the numerical representation of the average rate of change, in addition to missconceptions when using physics formulas in solving the problem. In addition, multiple solutions were derived along with their dominant methods during the qualitative analysis. The current research proposed to focus on the context problems with approximate calculations and numerical representation, using software and connection common relations between math and physics in the teaching process of teachers and professors.

Association of Smoking with Chest Radiographic and Lung Function Findings in Retired Bauxite Mining Workers

Inhalation hazards are associated with potentially injurious exposure and increased risk for lung diseases, within the bauxite mining industry, especially for the smelter workers. Smoking is related to decreased lung function and leads to chronic lung diseases. This study had the objective to evaluate whether smoking is related to functional and radiographic respiratory changes in retired bauxite mining workers. Methods: This was a retrospective and cross-sectional study involving the analysis of database information of 140 retired bauxite mining workers from Poços de Caldas-MG evaluated at Worker’s Health Reference Center and at the Social Security Brazilian National Institute, from July 1st, 2015 until June 30th, 2016. The workers were divided into three groups: non-smokers (n = 47), ex-smokers (n = 46), and smokers (n = 47). The data included: age, gender, spirometry results, and the presence or not of pulmonary pleural and/or parenchymal changes in chest radiographs. Chi-Squared test was used (p < 0,05). Results: In the smokers’ group, 83% of spirometry tests and 64% of chest x-rays were altered. In the non-smokers’ group, 19% of spirometry tests and 13% of chest x-rays were altered. In the ex-smokers’ group, 35% of spirometry tests and 30% of chest x-rays were altered. Most of the results were statistically significant. Results demonstrated a significant difference between smokers’ and non-smokers’ groups in regard to spirometric and radiographic pulmonary alterations. Ex-smokers’ and non-smokers’ group demonstrated better results when compared to the smokers’ group in relation to altered spirometry and radiograph findings. These data may contribute to planning strategies to enhance smoking cessation programs within the bauxite mining industry.

The Development of Student Core Competencies through the STEM Education Opportunities in Classroom

The goal of the modern education system is to prepare students to be able to adapt to ever-changing life situations. They must be able to acquire required knowledge independently; apply such knowledge in practice to solve various problems by using modern technologies; think critically and creatively; competently use information; be communicative, work in a team; and develop their own moral values, intellect and cultural awareness. As a result, the status of education significantly increases; new requirements to its quality have been formed. In recent years the competency-based approach in education has become of significant interest. This approach is a strengthening of applied and practical characteristics of a school education and leads to the forming of the key students’ competencies which define their success in future life. In this article, the authors’ attention focuses on a range of key competencies, educational, informational and communicative and on the possibility to develop such competencies via STEM education. This research shows the change in students’ attitude towards scientific disciplines such as mathematics, general science, technology and engineering as a result of STEM education. Two staged analyzed questionnaires completed by students of forms II to IV in the republic of Trinidad and Tobago allowed the authors to categorize students between two levels that represent students’ attitude to various disciplines. The significance of differences between selected levels was confirmed with the use of Pearson’s chi-squared test. In summary, the analysis of obtained data makes it possible to conclude that STEM education has a great potential for development of core students’ competencies and encourage the development of positive student attitude towards the above mentioned above scientific disciplines.

Physicians’ Knowledge and Perception of Gene Profiling in Malaysia

Availability of different genetic tests after completion of Human Genome Project increases the physicians’ responsibility to keep themselves update on the potential implementation of these genetic tests in their daily practice. However, due to numbers of barriers, still many of physicians are not either aware of these tests or are not willing to offer or refer their patients for genetic tests. This study was conducted an anonymous, cross-sectional, mailed-based survey to develop a primary data of Malaysian physicians’ level of knowledge and perception of gene profiling. Questionnaire had 29 questions. Total scores on selected questions were used to assess the level of knowledge. The highest possible score was 11. Descriptive statistics, one way ANOVA and chi-squared test was used for statistical analysis. Sixty three completed questionnaires were returned by 27 general practitioners (GPs) and 36 medical specialists. Responders’ age ranges from 24 to 55 years old (mean 30.2 ± 6.4). About 40% of the participants rated themselves as having poor level of knowledge in genetics in general whilst 60% believed that they have fair level of knowledge; however, almost half (46%) of the respondents felt that they were not knowledgeable about available genetic tests. A majority (94%) of the responders were not aware of any lab or company which is offering gene profiling services in Malaysia. Only 4% of participants were aware of using gene profiling for detection of dosage of some drugs. Respondents perceived greater utility of gene profiling for breast cancer (38%) compared to the colorectal familial cancer (3%). The score of knowledge ranged from 2 to 8 (mean 4.38 ± 1.67). Non- significant differences between score of knowledge of GPs and specialists were observed, with score of 4.19 and 4.58 respectively. There was no significant association between any demographic factors and level of knowledge. However, those who graduated between years 2001 to 2005 had higher level of knowledge. Overall, 83% of participants showed relatively high level of perception on value of gene profiling to detect patient’s risk of disease. However, low perception was observed for both statements of using gene profiling for general population in order to alter their lifestyle (25%) as well as having the full sequence of a patient genome for the purpose of determining a patient’s best match for treatment (18%). The lack of clinical guidelines, limited provider knowledge and awareness, lack of time and resources to educate patients, lack of evidence-based clinical information and cost of tests were the most barriers of ordering gene profiling mentioned by physicians. In conclusion Malaysian physicians who participate in this study had mediocre level of knowledge and awareness in gene profiling. The low exposure to the genetic questions and problems might be a key predictor of lack of awareness and knowledge on available genetic tests. Educational and training workshop might be useful in helping Malaysian physicians incorporate genetic profiling into practice for eligible patients.

The Application of Queuing Theory in Multi-Stage Production Lines

The purpose of this work is examining the multiproduct multi-stage in a battery production line. To improve the performances of an assembly production line by determine the efficiency of each workstation. Data collected from every workstation. The data are throughput rate, number of operator, and number of parts that arrive and leaves during part processing. Data for the number of parts that arrives and leaves are collected at least at the amount of ten samples to make the data is possible to be analyzed by Chi-Squared Goodness Test and queuing theory. Measures of this model served as the comparison with the standard data available in the company. Validation of the task time value resulted by comparing it with the task time value based on the company database. Some performance factors for the multi-product multi-stage in a battery production line in this work are shown. The efficiency in each workstation was also shown. Total production time to produce each part can be determined by adding the total task time in each workstation. To reduce the queuing time and increase the efficiency based on the analysis any probably improvement should be done. One probably action is by increasing the number of operators how manually operate this workstation.

On the Analysis of IP Traffic Distribution in the Network of Suranaree University of Technology

This paper presents the IP traffic analysis. The traffic was collected from the network of Suranaree University of Technology using the software based on the Simple Network Management Protocol (SNMP). In particular, we analyze the distribution of the aggregated traffic during the hours of peak load and light load. The traffic profiles including the parameters described the traffic distributions were derived. From the statistical analysis applying three different methods, including the Kolmogorov Smirnov test, Anderson Darling test, and Chi-Squared test, we found that the IP traffic distribution is a non-normal distribution and the distributions during the peak load and the light load are different. The experimental study and analysis show high uncertainty of the IP traffic.

Information Filtering using Index Word Selection based on the Topics

We have proposed an information filtering system using index word selection from a document set based on the topics included in a set of documents. This method narrows down the particularly characteristic words in a document set and the topics are obtained by Sparse Non-negative Matrix Factorization. In information filtering, a document is often represented with the vector in which the elements correspond to the weight of the index words, and the dimension of the vector becomes larger as the number of documents is increased. Therefore, it is possible that useless words as index words for the information filtering are included. In order to address the problem, the dimension needs to be reduced. Our proposal reduces the dimension by selecting index words based on the topics included in a document set. We have applied the Sparse Non-negative Matrix Factorization to the document set to obtain these topics. The filtering is carried out based on a centroid of the learning document set. The centroid is regarded as the user-s interest. In addition, the centroid is represented with a document vector whose elements consist of the weight of the selected index words. Using the English test collection MEDLINE, thus, we confirm the effectiveness of our proposal. Hence, our proposed selection can confirm the improvement of the recommendation accuracy from the other previous methods when selecting the appropriate number of index words. In addition, we discussed the selected index words by our proposal and we found our proposal was able to select the index words covered some minor topics included in the document set.

The Impact of HIV/AIDS on Micro-enterprise Development in Kenya: A Study of Obunga Slum in Kisumu

The performances of small and medium enterprises have stagnated in the last two decades. This has mainly been due to the emergence of HIV / Aids. The disease has had a detrimental effect on the general economy of the country leading to morbidity and mortality of the Kenyan workforce in their primary age. The present study sought to establish the economic impact of HIV / Aids on the micro-enterprise development in Obunga slum – Kisumu, in terms of production loss, increasing labor related cost and to establish possible strategies to address the impact of HIV / Aids on microenterprises. The study was necessitated by the observation that most micro-enterprises in the slum are facing severe economic and social crisis due to the impact of HIV / Aids, they get depleted and close down within a short time due to death of skilled and experience workforce. The study was carried out between June 2008 and June 2009 in Obunga slum. Data was subjected to computer aided statistical analysis that included descriptive statistic, chi-squared and ANOVA techniques. Chi-squared analysis on the micro-enterprise owners opinion on the impact of HIV / Aids on depletion of microenterprise compared to other diseases indicated high levels of the negative effects of the disease at significance levels of P

Probability Distribution of Rainfall Depth at Hourly Time-Scale

Rainfall data at fine resolution and knowledge of its characteristics plays a major role in the efficient design and operation of agricultural, telecommunication, runoff and erosion control as well as water quality control systems. The paper is aimed to study the statistical distribution of hourly rainfall depth for 12 representative stations spread across Peninsular Malaysia. Hourly rainfall data of 10 to 22 years period were collected and its statistical characteristics were estimated. Three probability distributions namely, Generalized Pareto, Exponential and Gamma distributions were proposed to model the hourly rainfall depth, and three goodness-of-fit tests, namely, Kolmogorov-Sminov, Anderson-Darling and Chi-Squared tests were used to evaluate their fitness. Result indicates that the east cost of the Peninsular receives higher depth of rainfall as compared to west coast. However, the rainfall frequency is found to be irregular. Also result from the goodness-of-fit tests show that all the three models fit the rainfall data at 1% level of significance. However, Generalized Pareto fits better than Exponential and Gamma distributions and is therefore recommended as the best fit.

Methods for Data Selection in Medical Databases: The Binary Logistic Regression -Relations with the Calculated Risks

The medical studies often require different methods for parameters selection, as a second step of processing, after the database-s designing and filling with information. One common task is the selection of fields that act as risk factors using wellknown methods, in order to find the most relevant risk factors and to establish a possible hierarchy between them. Different methods are available in this purpose, one of the most known being the binary logistic regression. We will present the mathematical principles of this method and a practical example of using it in the analysis of the influence of 10 different psychiatric diagnostics over 4 different types of offences (in a database made from 289 psychiatric patients involved in different types of offences). Finally, we will make some observations about the relation between the risk factors hierarchy established through binary logistic regression and the individual risks, as well as the results of Chi-squared test. We will show that the hierarchy built using the binary logistic regression doesn-t agree with the direct order of risk factors, even if it was naturally to assume this hypothesis as being always true.

AudioMine: Medical Data Mining in Heterogeneous Audiology Records

We report on the results of a pilot study in which a data-mining tool was developed for mining audiology records. The records were heterogeneous in that they contained numeric, category and textual data. The tools developed are designed to observe associations between any field in the records and any other field. The techniques employed were the statistical chi-squared test, and the use of self-organizing maps, an unsupervised neural learning approach.