Search Results

Results 1-7 of 7.
Item hits:

  • Authors: Martin, Niels;  Advisor: -;  Participants: Solti, Andreas; Mendling, Jan; Depaire, Benoît; Caris, An (2020)

  • Batch processing refers to an organization of work in which cases are synchronized such that they can be processed as a group. Prior research has studied batch processing mainly from a deductive angle, trying to identify optimal rules for composing batches. As a consequence, we lack methodological support to investigate according to which rules human resources build batches in work settings where batching rules are not strictly enforced. In this paper, we address this research gap by developing a technique to inductively mine batch activation rules from process execution data. The obtained batch activation rules can be used for various purposes, including to explicate the real-life batching behavior of human resources; to determine the compliance between the prescribed and actual ba...
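
    As a rough illustration of the kind of preprocessing such a mining technique starts from, the sketch below groups events of a flat event log into candidate batches by shared activity, resource, and near-simultaneous start time. All field names (`case_id`, `activity`, `resource`, `start_time`) and the time tolerance are illustrative assumptions, not the paper's actual schema or mined rules.

    ```python
    # Hypothetical sketch: surface candidate batches in an event log by
    # grouping events of the same activity at the same resource whose
    # start times fall into the same small time window.
    import pandas as pd

    def candidate_batches(log: pd.DataFrame, tolerance: str = "5min") -> pd.DataFrame:
        """Return events whose starts cluster within `tolerance`
        for the same (activity, resource) pair. `start_time` is
        assumed to be a datetime64 column."""
        log = log.sort_values("start_time")
        # Bucket start times into windows of width `tolerance`.
        log["window"] = log["start_time"].dt.floor(tolerance)
        groups = log.groupby(["activity", "resource", "window"])
        # A window holding more than one case is a candidate batch.
        sizes = groups["case_id"].transform("size")
        return log[sizes > 1]
    ```

    Mining activation rules would then ask, for each such group, what condition (queue length, time of day, elapsed waiting time) held when the resource started processing it.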


  • Authors: Martínez-Plumed, Fernando;  Advisor: -;  Participants: Contreras-Ochando, Lidia; Ferri, Cèsar; Hernández-Orallo, José; Kull, Meelis; Lachiche, Nicolas; Ramírez-Quintana, María José; Flach, Peter (2019)

  • CRISP-DM (CRoss-Industry Standard Process for Data Mining) has its origins in the second half of the nineties and is thus about two decades old. According to many surveys and user polls it is still the de facto standard for developing data mining and knowledge discovery projects. However, the field has undoubtedly moved on considerably in twenty years, with data science now favoured as the leading term over data mining. In this paper we investigate whether, and in what contexts, CRISP-DM is still fit for purpose for data science projects. We argue that if the project is goal-directed and process-driven, the process model view still largely holds. On the other hand, when data science projects become more exploratory, the paths a project can take become more varied, and a more...


  • Authors: Luo, Zhaojing;  Advisor: -;  Participants: Cai, Shaofeng; Chen, Gang; Gao, Jinyang; Lee, Wang-Chien; Ngiam, Kee Yuan; Zhang, Meihui (2019)

  • Deep learning and machine learning models have recently been shown to be effective in many real-world applications. While these models achieve increasingly better predictive performance, their structures have also become much more complex. A common and difficult problem for complex models is overfitting. Regularization is used to penalize the complexity of the model in order to avoid overfitting. However, in most learning frameworks the regularization function is set with hyper-parameters whose best setting is difficult to find. In this paper, we propose an adaptive regularization method, as part of a large end-to-end healthcare data analytics software stack, which effectively addresses this difficulty. First, we propose a general adaptive regularization method ba...
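
    For context, a minimal sketch of the fixed-penalty setup the abstract contrasts with: an L2-regularized least-squares objective whose strength `lam` is a hand-tuned hyper-parameter. The paper's adaptive method learns the penalty from data instead; this is only the baseline it improves on, with illustrative names throughout.

    ```python
    # Baseline sketch: L2-regularized loss with a fixed hyper-parameter
    # `lam`, the quantity that is hard to tune by hand and that adaptive
    # regularization methods aim to set automatically.
    import numpy as np

    def regularized_loss(w, X, y, lam=0.1):
        residual = X @ w - y
        data_term = 0.5 * np.mean(residual ** 2)   # fit to the data
        penalty = 0.5 * lam * np.sum(w ** 2)       # complexity penalty
        return data_term + penalty

    def regularized_grad(w, X, y, lam=0.1):
        n = len(y)
        # Gradient of the loss above: data-fit term plus penalty term.
        return X.T @ (X @ w - y) / n + lam * w
    ```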


  • Authors: Tavakkol, Behnam;  Advisor: -;  Participants: Jeong, Myong K.; Albin, Susan L. (2019)

  • Uncertain data objects are objects that can be characterized either by a probability density function (PDF) or by multiple points. Because of the uncertainty attached to such objects, their scatter might be very different from the scatter of certain data objects. Measures of scatter for uncertain objects have not been defined before. In this paper, we define the covariance matrix, within-scatter matrix, and between-scatter matrix as measures of scatter for uncertain data objects. Also in this paper, we extend the idea of Fisher linear discriminant analysis to uncertain objects (UFLDA), and we develop kernel Fisher discriminant analysis for uncertain objects (UKFDA). The developed uncertain kernel Fisher discriminants are for two cases: 1) ...
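
    For orientation, a sketch of the classical (certain-data) scatter matrices that the paper generalizes, for K classes C_k with n_k members, class means μ_k, and overall mean μ. The uncertain-object versions would replace sample points with quantities derived from each object's PDF, which is our reading of the abstract rather than the paper's exact construction.

    ```latex
    % Classical within-class and between-class scatter matrices,
    % the certain-data analogues of the measures defined in the paper.
    S_W = \sum_{k=1}^{K} \sum_{x_i \in C_k} (x_i - \mu_k)(x_i - \mu_k)^{\top}
    \qquad
    S_B = \sum_{k=1}^{K} n_k \,(\mu_k - \mu)(\mu_k - \mu)^{\top}

    % Fisher LDA seeks the projection w maximizing between-class scatter
    % relative to within-class scatter:
    J(w) = \frac{w^{\top} S_B \, w}{w^{\top} S_W \, w}
    ```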


  • Authors: Hung, Shao-Yen;  Advisor: -;  Participants: Lee, Chia-Yen; Lin, Yung-Lun (2020)

  • The transformation of wafers into chips is a complex manufacturing process involving literally thousands of equipment parameters. Delamination, a leading cause of defective products, can occur between die and epoxy molding compound (EMC), epoxy and substrate, lead frame and EMC, etc. Troubleshooting is generally done on a case-by-case basis and is both time-consuming and labor-intensive. We propose a three-phase data science framework for process prognosis and prediction. The first phase performs data preprocessing. The second phase uses LASSO regression and stepwise regression to identify the key variables affecting delamination. The third phase develops backpropagation neural network (BPNN), support vector regression (SVR), partial least squares (PLS), and gradient boosting machine (GBM)...
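
    A minimal sketch of the second-phase idea, using scikit-learn's LASSO to shortlist equipment parameters associated with delamination: features with nonzero coefficients survive the L1 penalty. The column handling and `alpha` value are illustrative assumptions, not the paper's settings.

    ```python
    # Sketch: LASSO-based variable selection. Standardize first, since
    # the L1 penalty is sensitive to feature scale, then keep features
    # whose fitted coefficients are nonzero.
    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.preprocessing import StandardScaler

    def key_variables(X: np.ndarray, y: np.ndarray, names, alpha=0.01):
        """Return names of features with nonzero LASSO coefficients."""
        Xs = StandardScaler().fit_transform(X)
        model = Lasso(alpha=alpha).fit(Xs, y)
        return [n for n, c in zip(names, model.coef_) if abs(c) > 1e-8]
    ```

    In a framework like the one described, the shortlist from this step would feed the third-phase predictive models (BPNN, SVR, PLS, GBM) rather than the full parameter set.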


  • Authors: Khezerlou, Amin Vahedian;  Advisor: -;  Participants: Zhou, Xun; Tong, Ling; Li, Yanhua; Luo, Jun (2019)

  • Identifying urban gathering events is an important problem due to the challenges such events bring to urban management. Recently, we proposed a hybrid model (H-VIGO-GIS) to predict future gathering events through trajectory destination prediction. Our approach consisted of two models, historical and recent, and continuously predicted future gathering events. However, H-VIGO-GIS has limitations. (1) The recent model does not capture newly emerged abnormal patterns effectively, since it uses all recent trajectories, including normal ones. (2) The recent model is sparse due to the limited number of trajectories it learns from, i.e., it cannot produce predictions in many cases, forcing us to rely only on the historical model. (3) The accuracy of both the recent and historical models varies over space and time. ...


  • Authors: Feng, Tianshu;  Advisor: -;  Participants: Davila, Jaime I.; Liu, Yuanhang; Lin, Sangdi; Huang, Shuai; Wang, Chen (2019)

  • Topological data analysis (TDA) is a powerful method for reducing data dimensionality, mining underlying data relationships, and intuitively representing the data structure. The Mapper algorithm is one such tool: it projects high-dimensional data to 1-dimensional space using a filter function that is subsequently used to reconstruct the data's topological relationships. However, domain context information and prior knowledge have not been considered in current TDA modeling frameworks. Here, we report the development and evaluation of a semi-supervised topological analysis (STA) framework that incorporates discretely or continuously labeled data points and selects the most relevant filter functions accordingly. We validate the proposed STA framework with simulation data and then apply it...
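
    A minimal sketch of the standard Mapper pipeline the abstract builds on: cover the range of a filter function with overlapping intervals, cluster each preimage, and connect clusters that share points. The filter choice, clustering method (DBSCAN here), and parameters are illustrative, not the STA framework itself, which additionally selects filters using label information.

    ```python
    # Sketch of the basic Mapper construction: overlapping cover of the
    # filter range -> per-interval clustering -> nerve graph.
    import numpy as np
    from sklearn.cluster import DBSCAN

    def mapper_graph(X, filter_values, n_intervals=10, overlap=0.3, eps=0.5):
        nodes, edges = [], set()
        lo, hi = filter_values.min(), filter_values.max()
        width = (hi - lo) / n_intervals
        for i in range(n_intervals):
            # Overlapping interval in filter space.
            a = lo + i * width - overlap * width
            b = lo + (i + 1) * width + overlap * width
            idx = np.where((filter_values >= a) & (filter_values <= b))[0]
            if len(idx) == 0:
                continue
            # Cluster the preimage of this interval (label -1 is noise).
            labels = DBSCAN(eps=eps).fit_predict(X[idx])
            for lab in set(labels) - {-1}:
                nodes.append(set(idx[labels == lab]))
        # Nerve: connect nodes whose point sets intersect.
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                if nodes[i] & nodes[j]:
                    edges.add((i, j))
        return nodes, edges
    ```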
