Abstract: This paper presents a text clustering system based on a k-means type subspace clustering algorithm for large, high-dimensional, and sparse text data. The algorithm adds a new step to the k-means clustering process that automatically calculates the weight of each keyword in each cluster, so that the important words of a cluster can be identified by their weight values. To support understanding and interpretation of the clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to the weights calculated by the new algorithm. Then, the candidates are fed to WordNet to identify the set of nouns and to consolidate synonyms and hyponyms. Experimental results show that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms such as Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting words that represent the topics of the clusters.
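As an illustration of the weight-calculation step described in this abstract, the following minimal Python sketch computes one weight per keyword for a single cluster. The abstract does not state the update rule, so the exponential weighting by within-cluster dispersion, the `gamma` parameter, and the function name `cluster_keyword_weights` are assumptions made only for illustration.

```python
import math


def cluster_keyword_weights(docs, centroid, gamma=1.0):
    """Return one weight per keyword for a single cluster.

    Assumption (not taken from the abstract): a keyword whose values vary
    little inside the cluster gets a larger weight, via exp(-dispersion/gamma).
    """
    n_terms = len(centroid)
    # Within-cluster dispersion of each keyword around the cluster centroid.
    dispersion = [
        sum((doc[j] - centroid[j]) ** 2 for doc in docs) for j in range(n_terms)
    ]
    # Subtract the minimum before exponentiating for numerical stability.
    d_min = min(dispersion)
    raw = [math.exp(-(d - d_min) / gamma) for d in dispersion]
    total = sum(raw)
    # Normalise so the weights of one cluster sum to 1.
    return [r / total for r in raw]


# Toy cluster: keyword 0 is constant, keyword 1 is spread out.
docs = [[1.0, 0.0], [1.0, 2.0], [1.0, 4.0]]
weights = cluster_keyword_weights(docs, centroid=[1.0, 2.0])
```

In this toy cluster the constant keyword receives almost all of the weight, which is the behaviour that lets high-weight keywords serve as candidate topic words.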
Abstract: Segmentation and quantification of stenosis is an
important task in assessing coronary artery disease. One of the main
challenges is measuring the real diameter of curved vessels.
Moreover, uncertainty in segmentation of different tissues in the
narrow vessel is an important issue that affects accuracy. This paper
proposes an algorithm to extract coronary arteries and measure the
degree of stenosis. A Markovian fuzzy clustering method is applied to
model the uncertainty arising from the partial volume effect. The
algorithm comprises four steps: segmentation, centreline extraction,
estimation of the plane orthogonal to the centreline, and measurement
of the degree of stenosis. To evaluate accuracy and reproducibility,
the approach has been applied to a vascular phantom and the results
compared with the real diameter. The results for 10 patient datasets
have been visually judged by a qualified radiologist. The results
reveal the superiority of the proposed method over the conventional
thresholding method (CTM) on both datasets.
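The final step, measuring the degree of stenosis, can be sketched as follows. The abstract does not define the measure, so this assumes the common percent-diameter definition, which compares the narrowest diameter with a healthy reference diameter. The function name and its arguments are hypothetical.

```python
def percent_diameter_stenosis(min_diameter_mm, reference_diameter_mm):
    """Percent-diameter stenosis under the assumed definition:
    100 * (1 - narrowest diameter / healthy reference diameter).
    """
    if reference_diameter_mm <= 0:
        raise ValueError("reference diameter must be positive")
    if not 0 <= min_diameter_mm <= reference_diameter_mm:
        raise ValueError("minimum diameter must lie in [0, reference]")
    return 100.0 * (1.0 - min_diameter_mm / reference_diameter_mm)


# A vessel narrowed from 3.0 mm to 1.5 mm.
degree = percent_diameter_stenosis(1.5, 3.0)
```

Both diameters would be measured in planes orthogonal to the centreline, which is why the orthogonal-plane estimate precedes this step.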
Abstract: In this paper, the concepts of dichotomous logistic
regression (DLR) with leave-one-out (L-O-O) were discussed. To
illustrate this, the L-O-O was run to determine the importance of the
simulation conditions for robust test of spread procedures with good
Type I error rates. The resultant model was then evaluated. The
discussions included 1) assessment of the accuracy of the model, and
2) parameter estimates. These were presented and illustrated by
modeling the relationship between the dichotomous dependent
variable (Type I error rates) with a set of independent variables (the
simulation conditions). Base SAS software, which provides PROC
LOGISTIC and DATA step functions, can be used to perform the
DLR analysis.
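The leave-one-out idea can be sketched outside SAS as follows. This is a minimal Python illustration, not the PROC LOGISTIC procedure used in the paper. It assumes a single numeric predictor, a plain gradient-descent fit, and made-up toy data. All function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=3000):
    """Fit y ~ sigmoid(w * x + b) by batch gradient descent on the log-loss."""
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def leave_one_out_accuracy(xs, ys):
    """Refit the model n times, each time predicting the one held-out case."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        w, b = fit_logistic(train_x, train_y)
        predicted = 1 if sigmoid(w * xs[i] + b) >= 0.5 else 0
        hits += predicted == ys[i]
    return hits / len(xs)


# Toy data (made up): one predictor, dichotomous outcome.
xs = [-3.0, -2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = leave_one_out_accuracy(xs, ys)
```

Each case is classified by a model that never saw it, so the resulting accuracy is a less optimistic estimate than the accuracy computed on the fitting data.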
Abstract: This paper details the application of a genetic
programming framework for induction of useful classification rules
from a database of income statements, balance sheets, and cash flow
statements for North American public companies. Potentially
interesting classification rules are discovered. Anomalies in the
discovery process merit further investigation of the application of
genetic programming to the dataset for the problem domain.
Abstract: Distance visualization of large datasets often relies on remote viewing and zooming of stored static images. However, the continuous growth in dataset size and in the cost of visualization operations leads to insufficient performance on traditional desktop computers. Additionally, visualization techniques such as Isosurface depend on the available resources of the machine they run on and on the size of the datasets. Moreover, the continuous demand for more computing power and the continuous growth in dataset size result in an urgent need for a grid computing infrastructure. However, some issues arise in current grids, such as client machines whose resources are insufficient to process large datasets. On top of that, different output devices and different network bandwidths between the visualization pipeline components often result in output that is suitable for one machine but not for another. In this paper we investigate how grid services can be used to support remote visualization of large datasets and to remove the constraint of physical co-location of resources by applying grid computing technologies. We present our grid-enabled architecture for remote interactive visualization of large medical datasets (circa 5 million polygons) on clients with modest resources.
Abstract: EPA (Ethernet for Plant Automation) resolves the nondeterminism of standard Ethernet and achieves real-time communication by means of a micro-segment topology and a deterministic scheduling mechanism. This paper studies the real-time performance of EPA periodic data transmission from theoretical and experimental perspectives. By analyzing the information transmission characteristics and the EPA deterministic scheduling mechanism, five indicators that can be used to specify the real-time performance of EPA periodic data transmission are presented and investigated: delivery time, time synchronization accuracy, data-sending time offset accuracy, utilization percentage of the configured timeslice, and non-RTE bandwidth. On this basis, the test principles and test methods for the indicators are studied, and formulas for the real-time performance of an EPA system are derived. Furthermore, an experiment platform is developed to test the indicators of EPA periodic data transmission in a micro-segment. Based on the analysis and the experiment, methods to improve the real-time performance of EPA periodic data transmission are proposed, including optimizing the network structure, studying a self-adaptive timeslice adjustment method, and providing data-sending time offset accuracy for configuration.
Abstract: This study introduces a new method for detecting,
sorting, and localizing spikes from multiunit EEG recordings. The
method combines the wavelet transform, which localizes distinctive
spike features, with the Super-Paramagnetic Clustering (SPC) algorithm,
which allows automatic classification of the data without assumptions
such as low variance or Gaussian distributions. Moreover, the method
is capable of setting amplitude thresholds for spike detection. The
method is applied to several real EEG data sets, in which the spikes
are detected and clustered and their occurrence times are determined.
Abstract: The goal of data mining algorithms is to discover
useful information embedded in large databases. One of the most
important data mining problems is discovery of frequently occurring
patterns in sequential data. In a multidimensional sequence each
event depends on more than one dimension. The search space is quite
large and the serial algorithms are not scalable for very large
datasets. To address this, it is necessary to study scalable parallel
implementations of sequence mining algorithms.
In this paper, we present a model for multidimensional sequences
and describe a parallel algorithm based on data parallelism.
Simulation experiments show good load balancing and scalable,
acceptable speedup across different numbers of processors and problem
sizes, and demonstrate that our approach can work efficiently in a
real parallel computing environment.
Abstract: Economic factors are leading to the rise of
infrastructures that provide software and computing facilities as a
service, known as cloud services or cloud computing. Cloud services
can provide efficiencies for application providers, both by limiting
up-front capital expenses, and by reducing the cost of ownership over
time. Such services are made available in a data center, using shared
commodity hardware for computation and storage. There is a varied
set of cloud services available today, including application services
(salesforce.com), storage services (Amazon S3), compute services
(Google App Engine, Amazon EC2) and data services (Amazon
SimpleDB, Microsoft SQL Server Data Services, Google's
Datastore). These services represent a variety of reformations of data
management architectures, and more are on the horizon.
Abstract: This paper presents a parametric performance-based
design model for optimizing hospital design. The design model
operates with geometric input parameters defining the functional
requirements of the hospital and with input parameters in terms of
performance objectives defining the design requirements and
preferences of the hospital with respect to performance. The design
model takes as its point of departure the hospital functionalities,
expressed as a set of defined parameters and rules describing the
design requirements and preferences.
Abstract: The Programmable Logic Controller (PLC) plays a
vital role in automation and process control. Grafcet is used for
representing the control logic, and traditional programming
languages are used for describing the pure algorithms. Grafcet is used
to divide the process to be automated into elementary sequences that
can be easily implemented. Each sequence represents a step that has
associated actions programmed using textual or graphical languages,
as the case requires. The programming task is simplified by using a set
of subroutines that are shared by several steps. The paper presents an
example implementation for a punching machine for sheets and
plates. Using graphical languages to program a complex sequential
process is a necessary solution. The state of the Grafcet can be used
for debugging and malfunction determination. The use of the method,
combined with knowledge acquisition for the process application,
reduces the downtime of the machine and improves productivity.
Abstract: Empirical insights into the implementation of logistics competencies at the top management level are scarce. This paper addresses this issue with an explorative approach which is based on a dataset of 872 observations in the years 2000, 2004 and 2008 using quantitative content analysis from annual reports of the 500 publicly listed firms with the highest global research and development expenditures according to the British Department for Business Innovation and Skills. We find that logistics competencies are more pronounced in Asian companies than in their European or American counterparts. On an industrial level the results are quite mixed. Using partial point-biserial correlations we show that logistics competencies are positively related to financial performance.
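For readers unfamiliar with the statistic, a plain point-biserial correlation between a dichotomous variable and a continuous one can be sketched as below. This is a simplified illustration: it omits the partialling-out of control variables that the paper applies. The choice of which variable is dichotomous, the function name, and the toy numbers are assumptions.

```python
import math
from statistics import fmean, pstdev


def point_biserial(flags, values):
    """Point-biserial correlation between a 0/1 variable and a numeric one.

    r_pb = (M1 - M0) / s * sqrt(n1 * n0 / n^2), with s the population
    standard deviation of all values. It equals Pearson's r on the same data.
    """
    group1 = [v for f, v in zip(flags, values) if f == 1]
    group0 = [v for f, v in zip(flags, values) if f == 0]
    n1, n0, n = len(group1), len(group0), len(values)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups must be non-empty")
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("values must not all be equal")
    return (fmean(group1) - fmean(group0)) / spread * math.sqrt(n1 * n0 / n**2)


# Toy data (made up): flag = competency mentioned, value = performance score.
r = point_biserial([0, 0, 1, 1], [1.0, 2.0, 3.0, 4.0])
```

A positive value means the flagged group has the higher mean on the continuous variable.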
Abstract: e-Commerce services have grown rapidly over the
past decade. However, user authentication still depends largely on
numeric approaches. The search for other verification methods
suitable for online e-Commerce is therefore of interest. In this
paper, a new online
signature-verification method using angular transformation is
presented. Delay shifts in online signatures are estimated by a
method that relies on the angle representation. In the
proposed signature-verification algorithm, all components of the input
signature are extracted by considering the discontinuous break points
in the stream of angular values. The estimated delay shift is then
obtained by comparison with the selected reference signature, and the
matching error is computed as the main feature used in the verification
process. The threshold offsets are calculated from two types of error
characteristic of the signature verification problem: the False Rejection
Rate (FRR) and the False Acceptance Rate (FAR). The level of these two
error rates depends on the decision threshold, which is chosen so as
to realize the Equal Error Rate (EER; FAR = FRR). The
experimental results show that, with a simple program deployed on the
Internet to demonstrate e-Commerce services, the proposed method
achieves 95.39% correct verifications, 7% better than a DP matching
based signature-verification method. In addition, signature
verification based on extracted components provides more reliable
results than a decision made on the whole signature.
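The threshold selection described in this abstract can be illustrated with a short Python sketch. It assumes that each signature yields a single matching error and that a signature is accepted when its error does not exceed the threshold. The function names and the toy error values are hypothetical and are not results from the paper.

```python
def far_frr(genuine_errors, forgery_errors, threshold):
    """Return (FAR, FRR) when signatures with error <= threshold are accepted."""
    # FRR: genuine signatures wrongly rejected (error above the threshold).
    frr = sum(e > threshold for e in genuine_errors) / len(genuine_errors)
    # FAR: forgeries wrongly accepted (error at or below the threshold).
    far = sum(e <= threshold for e in forgery_errors) / len(forgery_errors)
    return far, frr


def equal_error_threshold(genuine_errors, forgery_errors):
    """Pick the observed error value where FAR and FRR are closest."""
    candidates = sorted(set(genuine_errors) | set(forgery_errors))

    def gap(threshold):
        far, frr = far_frr(genuine_errors, forgery_errors, threshold)
        return abs(far - frr)

    return min(candidates, key=gap)


# Toy matching errors (made up for illustration).
genuine = [0.1, 0.2, 0.3, 0.6]
forgery = [0.4, 0.7, 0.8, 0.9]
threshold = equal_error_threshold(genuine, forgery)
far, frr = far_frr(genuine, forgery, threshold)
```

Raising the threshold lowers FRR and raises FAR, so the two curves cross at one operating point. With finite samples they may not meet exactly, which is why the sketch takes the closest pair.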
Abstract: In the present paper, a set of parametric FE stress
analyses is carried out for two-planar welded tubular DKT-joints
under two different axial load cases. Analysis results are used to
present general remarks on the effect of geometrical parameters on
the stress concentration factors (SCFs) at the inner saddle, outer
saddle, toe, and heel positions on the main (outer) brace. Then a new
set of SCF parametric equations is developed through nonlinear
regression analysis for the fatigue design of two-planar DKT-joints.
An assessment study of these equations is conducted against the
experimental data; and the satisfaction of the criteria regarding the
acceptance of parametric equations is checked. Significant effort has
been devoted by researchers to the study of SCFs in various uniplanar
tubular connections. Nevertheless, for multi-planar joints
covering the majority of practical applications, very few
investigations have been reported due to the complexity and high
cost involved.
Abstract: This paper considers the influence of promotion
instruments for renewable energy sources (RES) on a multi-energy
modeling framework. In Europe, so-called Feed-in Tariffs are
successfully used as incentive structures to increase the amount of
energy produced by RES. Because of the stochastic nature of large
scale integration of distributed generation, many problems have
occurred regarding the quality and stability of supply. Hence, a
macroscopic model was developed in order to optimize the power
supply of the local energy infrastructure, which includes electricity,
natural gas, fuel oil and district heating as energy carriers. Unique
features of the model are the integration of RES and the adoption of
Feed-in Tariffs into one optimization stage. Sensitivity studies are
carried out to examine the system behavior under changing profits
for the feed-in of RES. With a setup of three energy exchanging
regions and a multi-period optimization, the impact of costs and
profits is determined.
Abstract: The Carrier Frequency Offset (CFO) due to a time-varying
fading channel is the main cause of the loss of orthogonality
among OFDM subcarriers, which leads to inter-carrier interference
(ICI). Hence, it is necessary to estimate and compensate the CFO
precisely. For mobile broadband communications in particular, both the
CFO and the channel gain have to be estimated and tracked to maintain
system performance. Thus, synchronization pilots are embedded in
every OFDM symbol to track the variations. In this paper, we present
a pilot scheme for both channel and CFO estimation in which the channel
estimation process can be carried out with only one OFDM symbol.
Additionally, the proposed pilot scheme provides better
performance in CFO estimation than the conventional
orthogonal pilot scheme owing to the increased signal-to-interference
ratio.
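The link between CFO and ICI stated at the start of this abstract can be checked numerically with a small sketch. It assumes a noiseless, flat channel, a single active subcarrier, and a CFO normalised to the subcarrier spacing. The function name and the parameter values are illustrative and do not come from the paper.

```python
import cmath
import math


def leaked_power_fraction(cfo, n_subcarriers=8, active=2):
    """Fraction of one subcarrier's power that lands on other DFT bins.

    `cfo` is the frequency offset normalised to the subcarrier spacing.
    """
    n = n_subcarriers
    # Time-domain samples of one subcarrier, rotated by the frequency offset.
    samples = [
        cmath.exp(2j * math.pi * (active + cfo) * t / n) for t in range(n)
    ]

    def dft_bin(m):
        # Receiver-side DFT output for bin m.
        return sum(
            s * cmath.exp(-2j * math.pi * m * t / n)
            for t, s in enumerate(samples)
        ) / n

    total = sum(abs(dft_bin(m)) ** 2 for m in range(n))
    wanted = abs(dft_bin(active)) ** 2
    return (total - wanted) / total


no_offset = leaked_power_fraction(0.0)
with_offset = leaked_power_fraction(0.2)
```

With zero offset all of the power stays on the transmitted bin. With an offset of 0.2 subcarrier spacings, roughly 12% of the power spills onto neighbouring bins, which is the interference that accurate CFO estimation is meant to remove.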
Abstract: This study investigated students' perception of
self-efficacy and anxiety in acquiring the English language, and
examined the relationships among the independent variables,
confounding variables and students' performance in the English
language. The researcher tested the research hypotheses using a
sample group of 318 respondents out of a population of 400
students. The results obtained revealed that there was a significant
moderate negative relationship between English language anxiety
and performance in English language, but no significant relationship
between self-efficacy and English language performance, among the
middle-school students. There was a significant moderate negative
relationship between English language anxiety and self-efficacy. It
was discovered that general self-efficacy and English language
anxiety represented a significantly more powerful set of predictors
than the set of confounding variables. Thus, the study concluded that
English language anxiety and general self-efficacy were significant
predictors of English language performance among middle-school
students in Satri Si Suriyothai School.
Abstract: In this paper, we evaluate the performance of some wavelet-based coding algorithms, namely 3D QT-L, 3D SPIHT and JPEG2K. In the first step, we carry out an objective comparison between the three coders. For this purpose, eight MRI head scan test sets of 256 x 256 x 124 voxels have been used. Results show superior performance of the 3D SPIHT algorithm, whereas 3D QT-L outperforms JPEG2K. The second step consists of evaluating the robustness of the 3D SPIHT and JPEG2K coding algorithms over wireless transmission. Compressed dataset images are transmitted over an AWGN wireless channel or over a Rayleigh wireless channel. Results show the superiority of JPEG2K over these two channel models. In fact, it has been deduced that JPEG2K is more robust to coding errors. Thus we may conclude that error-correcting codes are needed to protect the transmitted medical information.
Abstract: CT assessment of the postoperative spine is challenging in the presence of metal streak artifacts that can degrade the
quality of CT images. In this paper, we studied the influence of different acquisition parameters on the magnitude of metal streaking.
A water-bath phantom was constructed with a metal insert to resemble postoperative spine assessment. The phantom was scanned with
different acquisition settings, and the acquired data were reconstructed
using various reconstruction settings. Standardized ROIs were defined within the streaking region for image analysis. The results show that
increased kVp and mAs enhanced SNR values by reducing image
noise. A sharper kernel enhanced image quality compared to a smooth
kernel, but produced more noise in the images, with higher CT fluctuation. The noise levels of the two kernels were significantly
different (P
Abstract: The paper discusses the mathematics of pattern
indexing and its applications to recognition of visual patterns that are
found in video clips. It is shown that (a) pattern indexes can be
represented by collections of inverted patterns, (b) solutions to
pattern classification problems can be found as intersections and
histograms of inverted patterns, so that matching of the original
patterns is avoided.
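One way to read points (a) and (b) is sketched below in Python. It assumes that each pattern is a set of discrete features and that an "inverted pattern" is the set of pattern labels sharing one feature. The paper's actual formalism may differ. The function names and the toy patterns are hypothetical.

```python
from collections import Counter, defaultdict


def build_inverted_patterns(patterns):
    """Map each feature to the set of pattern labels that contain it."""
    inverted = defaultdict(set)
    for label, features in patterns.items():
        for feature in features:
            inverted[feature].add(label)
    return inverted


def classify_by_histogram(inverted, query_features):
    """Count, per label, how many query features point to that label."""
    votes = Counter()
    for feature in query_features:
        votes.update(inverted.get(feature, ()))
    return votes


def classify_by_intersection(inverted, query_features):
    """Labels that contain every query feature."""
    sets = [inverted.get(feature, set()) for feature in query_features]
    return set.intersection(*sets) if sets else set()


# Toy patterns (made up): each is described by a set of features.
patterns = {
    "circle": {"round", "closed"},
    "square": {"corners", "closed"},
    "arc": {"round", "open"},
}
index = build_inverted_patterns(patterns)
votes = classify_by_histogram(index, {"round", "closed"})
exact = classify_by_intersection(index, {"round", "closed"})
```

Neither query touches the stored patterns themselves. Only the per-feature label sets are read, which is the sense in which matching of the original patterns is avoided.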