Abstract: Wikis are considered to be part of Web 2.0
technologies that potentially support collaborative learning and
writing. Wikis provide opportunities for multiple users to work on
the same document simultaneously. Most wikis also have a page for
written group discussion. Nevertheless, wikis may be used in
different ways depending on the pedagogy being used, and the
constraints imposed by the course design. This work explores
students' uses of wikis in teacher education. The analysis is based on a
taxonomy for classifying students' activities and actions carried out
on the wiki. The article also discusses the implications for using
wikis as collaborative writing tools in teacher education.
Abstract: Detection of incipient abnormal events is important to
improve safety and reliability of machine operations and reduce losses
caused by failures. Improper set-up or alignment of parts often leads to
severe problems in many machines. Models that predict faulty
conditions are therefore essential for deciding when to perform
machine maintenance. This paper
presents a multivariate calibration monitoring approach based on the
statistical analysis of machine measurement data. The calibration
model is used to predict two faulty conditions from historical reference
data. The approach uses genetic algorithm (GA) based variable
selection, and we evaluate the predictive performance of several
prediction methods using real data. The results show that the
calibration model based on supervised probabilistic principal
component analysis (SPPCA) yielded the best performance in this work.
By adopting a proper variable selection scheme in calibration models,
the prediction performance can be improved by excluding
non-informative variables from the model-building steps.
Abstract: Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is nowadays considered fundamental to many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction, question answering systems and others. This paper reports on the development of a NER system for Bengali and Hindi using a Support Vector Machine (SVM). Although this state-of-the-art machine learning technique has been widely applied to NER in several well-studied languages, its use for Indian languages (ILs) is very new. The system makes use of the contextual information of the words along with a variety of features that are helpful in predicting the four named entity (NE) classes: Person name, Location name, Organization name and Miscellaneous name. We have used annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi tagged with the twelve NE classes defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL). In addition, we have manually annotated 150K wordforms of the Bengali news corpus, developed from the web archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm to generate lexical context patterns from a part of the unlabeled Bengali news corpus. The lexical patterns have been used as features of the SVM in order to improve system performance. The NER system has been tested with gold standard test sets of 35K and 60K tokens for Bengali and Hindi, respectively. Evaluation results have demonstrated recall, precision, and f-score values of 88.61%, 80.12%, and 84.15%, respectively, for Bengali and 80.23%, 74.34%, and 77.17%, respectively, for Hindi. The use of context patterns improves the f-score by 5.13%.
A statistical analysis (ANOVA) is also performed to compare the performance of the proposed NER system with that of the existing HMM-based system for both languages.
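The reported f-scores can be checked against the stated precision and recall. The sketch below assumes the f-score is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not state the weighting, so this is an assumption.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall in percent, as quoted in the abstract.
bengali = f_score(precision=80.12, recall=88.61)
hindi = f_score(precision=74.34, recall=80.23)
print(f"Bengali: {bengali:.2f}%  Hindi: {hindi:.2f}%")
```

Under that assumption the computed values agree with the reported 84.15% and 77.17% to two decimal places.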
Abstract: The analysis of the Acoustic Emission (AE) signal
generated from metal cutting processes has often been approached
statistically. This is due to the stochastic nature of the emission
signal as a result of factors affecting the signal from its generation
through transmission and sensing. Different techniques are applied in
this manner, each of which is suitable for certain processes. In metal
cutting where the emission generated by the deformation process is
rather continuous, an appropriate method for analysing the AE signal
based on the root mean square (RMS) of the signal is often used and
is suitable for use with the conventional signal processing systems.
The aim of this paper is to set out a strategy for tool failure detection in
turning processes via the statistical analysis of the AE generated from
the cutting zone. The strategy is based on the investigation of the
distribution moments of the AE signal at predetermined sampling.
The skewness and kurtosis of these distributions are the key elements in
the detection. A normal (Gaussian) distribution was first
suggested, but it was eliminated as insufficient. The
Beta distribution was then considered; it has been used with
an assumed β density function and has given promising results with
regard to chipping and tool breakage detection.
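As an illustration of the moment-based idea, the sketch below computes the RMS, skewness and excess kurtosis of a signal over fixed-length windows. The window length, the synthetic signal and the injected burst are hypothetical; the abstract's Beta-distribution fitting step is not reproduced here.

```python
import numpy as np

def window_moments(signal, window):
    """Return per-window RMS, skewness and excess kurtosis for non-overlapping windows."""
    x = np.asarray(signal, dtype=float)
    n = (x.size // window) * window            # drop an incomplete trailing window
    w = x[:n].reshape(-1, window)
    rms = np.sqrt(np.mean(w**2, axis=1))
    mu = w.mean(axis=1, keepdims=True)
    sigma = w.std(axis=1, keepdims=True)
    z = (w - mu) / sigma                       # standardized samples within each window
    skew = np.mean(z**3, axis=1)
    kurt = np.mean(z**4, axis=1) - 3.0         # zero for a Gaussian window
    return rms, skew, kurt

rng = np.random.default_rng(0)
ae = rng.normal(0.0, 1.0, size=10_000)         # synthetic stand-in for a continuous AE signal
ae[6_000:6_050] += 8.0                         # hypothetical burst, e.g. a chipping event
rms, skew, kurt = window_moments(ae, window=1_000)
print("window with largest kurtosis:", int(np.argmax(kurt)))
```

In this synthetic case the window containing the burst stands out in both skewness and kurtosis, which is the kind of cue the strategy relies on.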
Abstract: A 7-step method (with 25 sub-steps) to assess risk of
air pollutants is introduced. These steps are: pre-considerations,
sampling, statistical analysis, exposure matrix and likelihood, dose-response
matrix and likelihood, total risk evaluation, and discussion
of findings. All of these terms and expressions are well understood;
however, almost all steps have been modified, improved,
and coupled in such a way that a comprehensive method has been
prepared. Accordingly, SADRA (Statistical Analysis-Driven Risk
Assessment) emphasizes extensive and ongoing application of
analytical statistics in traditional risk assessment models. A sulfur
dioxide case study validates the claim and provides a good
illustration of this method.
Abstract: Image processing for capsule endoscopy requires a large amount of
memory, and diagnosis takes hours since the operation time is
normally more than 8 hours. A real-time analysis algorithm for capsule
images can be clinically very useful. It can differentiate abnormal
tissue from healthy structure and provide correlation information
among the images. Bleeding is our interest in this regard, and we
propose a method of detecting frames with potential bleeding in
real time. Our detection algorithm is based on statistical analysis and
the shapes of bleeding spots. We tested our algorithm on 30 cases of
capsule endoscopy in the digestive tract. Results were excellent:
a sensitivity of 99% and a specificity of 97% were achieved in
detecting the image frames with bleeding spots.
Abstract: Landscape connectivity combines a description of the
physical structure of the landscape with a particular species' response to
that structure, which forms the theoretical background for applying
landscape connectivity principles in the practices of landscape
planning and design. In this study, a residential development project in
the southern United States was used to explore the meaning of
landscape connectivity and its application in town planning. The vast
rural landscape in the southern United States is conspicuously
characterized by hedgerow trees or groves. The patchwork
landscape of fields surrounded by high hedgerows is a traditional and
familiar feature of the American countryside. Hedgerows are in effect
linear strips of trees, groves, or woodlands, which are often critical
habitats for wildlife and important for the visual quality of the
landscape. Based on geographic information system (GIS) and
statistical analysis (FRAGSTATS), this study attempts to quantify the
landscape connectivity characterized by hedgerows in south Alabama,
where substantial areas of authentic hedgerow landscape are being
urbanized due to the ever-expanding real estate industry and high
demand for new residential development. The results of this study
shed light on how to balance the needs of new urban development and
biodiversity conservation by maintaining a higher level of landscape
connectivity, and will thus inform the design intervention.
Abstract: A complex statistical analysis of stresses in the concrete
slab of a real type of rigid pavement is performed. The
computational model of the pavement is designed as a spatial (3D) model based on a nonlinear variant of the finite element method
that respects the structural nonlinearity and makes it possible to model different arrangements of joints; the entire model can be loaded by a
thermal load. The interaction of adjacent slabs in joints and the contact between the slab and the subsequent layer are modeled with the help of special
contact elements. Four concrete slabs separated by transverse and
longitudinal joints, the additional subgrade layers, and soil to a depth of about 3 m are modeled. The thickness of individual layers,
physical and mechanical properties of materials, characteristics of
joints, and the temperature of the upper and lower surfaces of slabs are assumed to be random variables. The modern simulation technique
Updated Latin Hypercube Sampling with 20 simulations is used for the statistical analysis. As a result, estimates of the basic statistics of the
principal stresses σ1 and σ3 at 53 points on the upper and lower surfaces of the slabs are obtained.
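To show how a small number of simulations can still cover each input's range, here is a minimal sketch of plain Latin Hypercube Sampling with 20 simulations. It is not the Updated variant named in the abstract, and the three input variables and their uniform bounds are hypothetical placeholders.

```python
import numpy as np

def latin_hypercube(n_sims, n_vars, rng):
    """Return an (n_sims, n_vars) array in [0, 1) with one sample per equal-probability stratum per column."""
    # One random point inside each of the n_sims strata, for every variable.
    u = (np.arange(n_sims)[:, None] + rng.random((n_sims, n_vars))) / n_sims
    # Shuffle the strata independently for each variable.
    for j in range(n_vars):
        u[:, j] = rng.permutation(u[:, j])
    return u

rng = np.random.default_rng(42)
u = latin_hypercube(n_sims=20, n_vars=3, rng=rng)

# Map to physical ranges (hypothetical uniform bounds, for illustration only):
# slab thickness [m], elastic modulus [Pa], upper-surface temperature [deg C].
low = np.array([0.20, 25e9, -10.0])
high = np.array([0.30, 35e9, 40.0])
samples = low + u * (high - low)
print(samples.shape)
```

Each row of `samples` would then be one input set for a run of the finite element model, and the statistics of the resulting stresses are estimated from the 20 outputs.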
Abstract: The result of the process of a territory's development is the territory's state of development (TSoD), which is directed towards the provision and improvement of people's living conditions. The authors propose measuring the TSoD with a model of their own design. Using the available statistical data on the values of the model's elements, the authors show empirically which element mainly determines the TSoD. The findings of the research showed that the key elements of the TSoD are the “Material welfare of people” and “People's health”. A deeper statistical analysis of the correlation between these elements showed that a country need not be bent on increasing the material growth of a territory, because a relatively high life expectancy at birth can also be ensured with much more modest material resources. On the other hand, the economic feedback of a longer lifespan in countries with lower material performance is also relatively low.
Abstract: Drought is one of the most important natural disasters. It can occur in all regions, whatever their climate, and in addition to causing death it results in many economic losses and social consequences. For this reason, it is necessary to study the effects and losses caused by drought in the country, which include limitation or shortage of agricultural and drinking water resources, decreased rainfall and increased evapotranspiration, limited plant growth and decreased agricultural production (especially in dry farming), lower levels of surface and ground water, increased migration, etc. Data for the statistical period 1988-2007 from six stations in Roudbar town were used for statistical analysis and for calculating humid and dry years. The dependable rainfall index (DRI) was the main method used in this research. Results showed that, within the said statistical period, more than half of the stations faced drought during the years 1996-1998 and 2007. From the conducted studies, the diagrams drawn, and the comparison of the available data with those of dry and humid years, it was found that drought affected agricultural products (e.g. olives): during the 1996 drought, the olive groves of Roudbar suffered the greatest damage, with about 70% of the crop lost.
Abstract: Plackett-Burman statistical screening of media
constituents and operational conditions for extracellular lipase
production from isolate Trichoderma viride has been carried out in
submerged fermentation. This statistical design is used in the early
stages of experimentation to screen out unimportant factors from a
large number of possible factors. This design involves screening of
up to 'n-1' variables in just 'n' number of experiments. Regression
coefficients and t-values were calculated by subjecting the
experimental data to statistical analysis using Minitab version 15.
The effects of nine process variables were studied in twelve
experimental trials. A maximum lipase activity of 7.83 μmol/ml/min
was obtained in the 6th trial. A Pareto chart illustrates the order of
significance of the variables affecting lipase production. The
present study concludes that the most significant variables affecting
lipase production are palm oil, yeast extract, K2HPO4,
MgSO4 and CaCl2.
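The "n-1 variables in n experiments" property can be illustrated with the standard 12-run Plackett-Burman design, built from cyclic shifts of a generator row plus a final all-low run. The sketch below is an assumption about how such a screening could be set up; the response values are synthetic, not the study's lipase activities, and the analysis there was done in Minitab.

```python
import numpy as np

def plackett_burman_12():
    """12-run Plackett-Burman design as a 12 x 11 matrix of +1/-1 factor levels."""
    gen = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])
    rows = [np.roll(gen, i) for i in range(11)]    # cyclic shifts of the generator row
    rows.append(-np.ones(11, dtype=int))           # final run: every factor at its low level
    return np.array(rows)

def main_effects(design, response):
    """Effect of each factor: mean response at +1 minus mean response at -1."""
    response = np.asarray(response, dtype=float)
    return design.T @ response / (design.shape[0] / 2)

full = plackett_burman_12()
design = full[:, :9]                               # nine real factors; two columns left as dummies

# Synthetic response: only factors 0 and 3 matter (illustrative, not measured data).
response = 5.0 + 2.0 * design[:, 0] - 1.0 * design[:, 3]
effects = main_effects(design, response)
print(np.round(effects, 3))
```

Because the columns are mutually orthogonal and balanced, each main effect is estimated independently of the others, which is what makes the design useful for early screening.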
Abstract: Thrombosis can be life threatening and therefore requires immediate treatment. Hydergine, a nootropic agent, is used as a cognition enhancer in stroke patients, but relatively little is known about its anti-thrombolytic effect. To investigate this aspect, in vivo and ex vivo experiments were designed and conducted. Three groups of rats were injected intraperitoneally with 1.5 mg, 3.0 mg and 4.5 mg hydergine, with and without prior exposure to fresh plasma. Positive and negative controls were run in parallel. Animals were sacrificed after 1.5 h, and BT, CT, PT, INR, APTT and plasma calcium levels were estimated. For ex vivo analyses, each 1 ml of aspirated blood was exposed to a 0.1 mg, 0.2 mg or 0.3 mg dose of hydergine, with parallel controls. Parameters analyzed were as above. Statistical analysis was through one-way ANOVA. Duncan's and Tukey's tests provided intra-group variance. BT, CT, PT, INR and APTT increased while calcium levels dropped significantly (P
Abstract: Robots' visual perception is a field that is gaining
increasing attention from researchers. This is partly due to emerging
trends in the commercial availability of 3D scanning systems or
devices that produce a high information accuracy level for a variety of
applications. In the history of mining, the mortality rate of mine workers
has been alarming, and robots show great potential to
tackle safety issues in mines. However, an effective vision system
is crucial to safe autonomous navigation in underground terrains.
This work investigates robots' perception in underground terrains
(mines and tunnels) using the statistical region merging (SRM) model.
SRM reconstructs the main structural components of an image
by a simple but effective statistical analysis. An investigation is
conducted on different regions of the mine, such as the shaft, stope
and gallery, using publicly available mine frames, with a stream of
locally captured mine images. An investigation is also conducted on a
stream of underground tunnel image frames, using the XBOX Kinect
3D sensors. The Kinect sensors produce streams of red, green and
blue (RGB) and depth images of 640 x 480 resolution at 30 frames per
second. Integrating the depth information to drivability gives a strong
cue to the analysis, which detects 3D results augmenting drivable and
non-drivable regions in 2D. The results of the 2D and 3D experiments
with different terrains, mines and tunnels, together with the qualitative
and quantitative evaluation, reveal that a good drivable region can be
detected in dynamic underground terrains.
Abstract: The aim of the study was to identify factors associated with
seat belt wearing among road users in Malaysia. An evidence-based approach
through in-depth crash investigation was utilised to meet
this objective. The study was limited to crashes
investigated by the Malaysian Institute of Road Safety Research
(MIROS) involving passenger vehicles between 2007 and 2010. Crash
information for a total of 99 crash cases involving 240 vehicles and
864 occupants was obtained during the study period. Statistical tests
and logistic regression analysis were performed. Results of the
analysis revealed that gender, seat position and age were associated
with seat belt wearing compliance in Malaysia. Males had 97.6%
higher odds of wearing a seat belt than females (95% CI 1.317 to
2.964). By seat position, the findings indicate that front occupants
had 82 times the odds of wearing a seat belt (95% CI 30.199 to
225.342) compared with rear occupants. It is also important to note
that the odds of seat belt wearing increased by about 2.64% (95% CI
1.0176 to 1.0353) for every one-year increase in age. This study is
essential in understanding the Malaysian tendency to belt up
while travelling in a vehicle. The factors highlighted in this
study should be emphasized in road safety education in order to
increase the seat belt wearing rate in this country and ultimately
prevent deaths due to road crashes.
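The quoted percentages and confidence intervals can be related to each other through the odds ratio. The sketch below assumes the intervals are Wald intervals from the logistic regression, which are symmetric on the log-odds scale, so the point estimate is the geometric mean of the two limits; that assumption is not stated in the abstract.

```python
import math

def odds_ratio_from_ci(lower: float, upper: float) -> float:
    """Point estimate implied by a Wald CI, which is symmetric on the log-odds scale."""
    return math.sqrt(lower * upper)

def percent_change_in_odds(odds_ratio: float) -> float:
    """Express an odds ratio as a percentage change in the odds."""
    return (odds_ratio - 1.0) * 100.0

intervals = {
    "male vs female": (1.317, 2.964),
    "front vs rear seat": (30.199, 225.342),
    "per year of age": (1.0176, 1.0353),
}
for label, (lo, hi) in intervals.items():
    or_ = odds_ratio_from_ci(lo, hi)
    print(f"{label}: OR = {or_:.3f}, change in odds = {percent_change_in_odds(or_):.2f}%")
```

Under that assumption the recovered odds ratios are close to 1.976, 82 and 1.0264, matching the reported 97.6%, 82 times and 2.64%.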
Abstract: The number of features required to represent an image
can be very large. Using all available features to recognize objects
can suffer from the curse of dimensionality. Feature selection and
extraction is the pre-processing step of image mining. The main issues in
analyzing images are the effective identification of features and
their extraction. The mining problem addressed here
is the grouping of features for different shapes. Experiments
have been conducted using the shape outline as the feature. Shape
outline readings are put through a normalization and dimensionality
reduction process using an eigenvector-based method to produce a
new set of readings. After this pre-processing step, the data are
grouped by shape. Through statistical analysis of these
readings together with peak measures, a robust classification and
recognition process is achieved. Tests showed that the suggested
methods are able to automatically recognize objects through their
shapes. Finally, experiments also demonstrate the system's invariance
to rotation, translation, scale, reflection and to a small degree of
distortion.
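The normalization and eigenvector-based reduction step can be sketched as follows. The abstract does not name the method, so principal component analysis is assumed here, and the outline readings are synthetic.

```python
import numpy as np

def pca_reduce(readings, n_components):
    """Normalize each reading, then project onto the leading eigenvectors of the covariance matrix."""
    x = np.asarray(readings, dtype=float)
    x = (x - x.mean(axis=0)) / x.std(axis=0)       # zero mean, unit variance per reading
    cov = np.cov(x, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return x @ eigvecs[:, order], eigvals[order]

# Synthetic outline readings: 50 shapes x 8 readings driven by 2 underlying factors.
rng = np.random.default_rng(3)
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 8))
readings = latent @ mixing + 0.05 * rng.normal(size=(50, 8))

reduced, variances = pca_reduce(readings, n_components=2)
print(reduced.shape, np.round(variances, 2))
```

The reduced readings could then be passed to whatever grouping or classification step follows; that step is outside this sketch.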
Abstract: This paper investigates the spatial structure of employment in the Jakarta Metropolitan Area (JMA), with reference to the concept of the Southeast Asian extended metropolitan region (EMR). A combination of factor analysis and local Getis-Ord (Gi*) hot-spot analysis is used to identify clusters of employment in the region, including those of the urban and agriculture sectors. Spatial statistical analysis is further used to probe the spatial association of the identified employment clusters with their surroundings on several dimensions, including the spatial influence of the central business district (CBD) in Jakarta city on employment density in the region, the spatial impacts of urban expansion on population growth, and the degree of urban-rural interaction. The degree of spatial interaction for the whole JMA is measured by the patterns of commuting trips destined for the various employment clusters. Results reveal the strong role of the urban core of Jakarta, and the regional CBD, as the centre for mixed job sectors such as retail, wholesale, services and finance. Manufacturing and local government services, on the other hand, form corridors radiating out of the urban core, reaching out to the agriculture zones in the fringes. Strong associations between the urban expansion corridors and population growth, and urban-rural mix, are revealed particularly in the eastern and western parts of JMA. Metropolitan-wide commuting patterns are focussed on the urban core of Jakarta and the CBD, while relatively local commuting patterns are shown to be prevalent for the employment corridors.
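For readers unfamiliar with the hot-spot statistic, here is a minimal sketch of the local Getis-Ord Gi* z-score in its standard form. The zone values and the binary neighbour weights are hypothetical; the study's actual weight specification is not given in the abstract.

```python
import numpy as np

def getis_ord_gi_star(values, weights):
    """Local Gi* z-scores; `weights` is an (n, n) spatial weight matrix that includes each zone itself."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    mean = x.mean()
    s = np.sqrt(np.mean(x**2) - mean**2)
    w_sum = w.sum(axis=1)
    w_sq_sum = (w**2).sum(axis=1)
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum**2) / (n - 1))
    return numerator / denominator

# Hypothetical employment densities for seven zones along a corridor.
density = np.array([1.0, 1.0, 1.0, 9.0, 10.0, 9.0, 1.0])
idx = np.arange(density.size)
# Binary weights: each zone, plus its immediate left and right neighbours.
weights = (np.abs(idx[:, None] - idx[None, :]) <= 1).astype(float)

gi = getis_ord_gi_star(density, weights)
print(np.round(gi, 2))
```

Large positive scores mark zones surrounded by high values (hot spots), and large negative scores mark cold spots; here the centre of the high-density run scores highest.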
Abstract: Web usage mining has become a popular research
area, as a huge amount of data is available online. These data can be
used for several purposes, such as web personalization, web structure
enhancement, web navigation prediction etc. However, the raw log
files are not directly usable; they have to be preprocessed in order to
transform them into a suitable format for different data mining tasks.
One of the key issues in the preprocessing phase is to identify web
users. Identifying users based on web log files is not a
straightforward problem, thus various methods have been developed.
There are several difficulties that have to be overcome, such as client
side caching, changing and shared IP addresses and so on. This paper
presents three different methods for identifying web users. Two of
them are the most commonly used methods in web log mining
systems, whereas the third one is our novel approach that uses a
complex cookie-based method to identify web users. Furthermore, we
also take steps towards identifying the individuals behind the
impersonal web users. To demonstrate the efficiency of the new
method, we developed an implementation called the Web Activity
Tracking (WAT) system, which aims at a more precise distinction of
web users based on log data. We present some statistical analysis
produced by the WAT on real data about the behavior of Hungarian
web users, and a comprehensive analysis and comparison of the three
methods.
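The abstract does not name the two commonly used methods. One heuristic often seen in web log preprocessing treats each distinct (IP address, user agent) pair as one user; the sketch below assumes that heuristic, with hypothetical, already-parsed log records, and is not the WAT implementation.

```python
from collections import defaultdict

def identify_users(records):
    """Treat each distinct (IP address, user agent) pair as one user and collect its requests."""
    users = defaultdict(list)
    for rec in records:
        users[(rec["ip"], rec["agent"])].append(rec["url"])
    return dict(users)

# Hypothetical log records (field names are illustrative).
log = [
    {"ip": "10.0.0.1", "agent": "Firefox", "url": "/index.html"},
    {"ip": "10.0.0.1", "agent": "Firefox", "url": "/news.html"},
    {"ip": "10.0.0.1", "agent": "Opera", "url": "/index.html"},   # same IP, different browser
    {"ip": "10.0.0.2", "agent": "Firefox", "url": "/index.html"},
]
users = identify_users(log)
print(len(users), "users identified")
```

The shared-IP difficulty mentioned in the abstract shows up directly: two people behind one address with the same browser would be merged into a single user, which is what a cookie-based method tries to avoid.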
Abstract: Quantitative measurements of tumors in general, and of tumor volume in particular, become more realistic with the use of Magnetic Resonance imaging, especially when the tumor's morphological changes become irregular and difficult to assess by clinical examination. However, tumor volume estimation strongly depends on the image segmentation, which is fuzzy by nature. In this paper a fuzzy approach is presented for tumor volume segmentation based on the fuzzy connectedness algorithm. The fuzzy affinity matrix resulting from segmentation is then used to estimate a fuzzy volume based on a certainty parameter, an Alpha Cut, defined by the user. The proposed method was shown to strongly affect treatment decisions. A statistical analysis was performed in this study to validate the results against a manual method for volume estimation, and the importance of using the Alpha Cut is further explained.
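One plausible reading of the Alpha Cut step is that only voxels whose membership reaches the user-defined certainty are counted towards the volume. The sketch below follows that reading; the membership values and voxel size are hypothetical, and the paper's exact volume definition may differ.

```python
import numpy as np

def alpha_cut_volume(affinity, alpha, voxel_volume_mm3):
    """Volume of all voxels whose fuzzy membership is at least `alpha`."""
    affinity = np.asarray(affinity, dtype=float)
    return np.count_nonzero(affinity >= alpha) * voxel_volume_mm3

# Hypothetical fuzzy membership map (1 slice, 2 x 2 voxels) and voxel size.
affinity = np.array([[[0.9, 0.6],
                      [0.4, 0.1]]])
for alpha in (0.3, 0.5, 0.8):
    print(alpha, alpha_cut_volume(affinity, alpha, voxel_volume_mm3=2.0))
```

Raising the Alpha Cut can only shrink the estimated volume, which makes the certainty parameter's influence on any volume-based decision easy to see.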
Abstract: The use of artificial neural network (ANN) modeling
for predicting and forecasting variables in water resources
engineering is increasing rapidly. Infrastructural applications
of ANN in terms of selection of inputs, architecture of networks,
training algorithms, and selection of training parameters in different
types of neural networks used in water resources engineering have
been reported. ANN modeling studies of water resources
engineering variables (river sediment and discharge) published in
high-impact journals from 2002 to 2011 have been examined and
presented in this review. ANN is a powerful technique for developing
relationships between the input and output variables, and is
able to extract complex behavior between water resources
variables such as river sediment and discharge. It can produce robust
prediction results for many water resources engineering
problems by appropriate learning from a set of examples. It is
important to have a good understanding of the input and output
variables from a statistical analysis of the data before network
modeling, which can facilitate the design of an efficient network. An
appropriately trained ANN model is able to capture the physical
relationship between the variables and may generate more effective
results than conventional prediction techniques.
Abstract: A novel typical day prediction model has been built and validated against the measured data of a grid-connected solar photovoltaic (PV) system in Macau. Unlike the conventional statistical method used in previous studies of PV systems, which obtains results by averaging nearby continuous points, the present typical day statistical method obtains the value at every minute of a typical day by averaging discontinuous points at the same minute on different days. This typical day statistical method based on discontinuous point averaging makes it possible to obtain Gaussian-shaped dynamical distributions for solar irradiance and output power in a yearly or monthly typical day. Based on the yearly typical day statistical analysis results, the maximum possible accumulated output energy in a year under on-site climate conditions and the corresponding optimal PV system running time are obtained. Periodic Gaussian-shaped prediction models for solar irradiance, output energy and system energy efficiency have been built, and their coefficients have been determined from the yearly, maximum and minimum monthly typical day Gaussian distribution parameters, which are obtained by iterating for the minimum Root Mean Squared Deviation (RMSD). With the present model, the dynamical effects due to the time of day are kept, and the day-to-day uncertainty due to changing weather is smoothed but still included. The periodic Gaussian-shaped correlations for solar irradiance, output power and system energy efficiency compare favorably with data from the PV system in Macau and prove to be an improvement over previous models.
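The discontinuous-point averaging can be sketched as a reshape of a minute-resolution series into days followed by a mean over days. The data below are synthetic, the Gaussian parameters are hypothetical, and the iterative fit for the minimum RMSD is reduced to evaluating the RMSD against a known profile.

```python
import numpy as np

def typical_day(minute_series, minutes_per_day=1440):
    """Average the value at each minute of the day across all complete days."""
    x = np.asarray(minute_series, dtype=float)
    n_days = x.size // minutes_per_day
    days = x[: n_days * minutes_per_day].reshape(n_days, minutes_per_day)
    return np.nanmean(days, axis=0)                # ignore missing (NaN) readings

def gaussian(t, amplitude, center, width):
    """Gaussian-shaped daily profile."""
    return amplitude * np.exp(-((t - center) ** 2) / (2.0 * width**2))

def rmsd(a, b):
    """Root mean squared deviation between two profiles."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

# Synthetic irradiance: 30 days, each a Gaussian profile scaled by a random cloud factor.
rng = np.random.default_rng(1)
t = np.arange(1440)
cloud = rng.uniform(0.3, 1.0, size=30)
series = np.concatenate([c * gaussian(t, 900.0, 750.0, 150.0) for c in cloud])

profile = typical_day(series)
print("peak minute:", int(np.argmax(profile)))
print("RMSD vs scaled Gaussian:", rmsd(profile, gaussian(t, 900.0 * cloud.mean(), 750.0, 150.0)))
```

Averaging across days keeps the within-day shape while smoothing day-to-day weather variation, which is the property the abstract attributes to the method.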