The Annual Meeting of the International Society for Data Science and Analytics (ISDSA) aims to provide a global forum for researchers and practitioners in the field of data science and data analytics to communicate and showcase their latest research. The ISDSA Meeting is for anyone interested in data science and data analytics. Researchers, practitioners, educators, and users come together at the ISDSA Meeting to present new and ongoing work and to discuss future directions for data science and data analytics.
Because of the ongoing pandemic, the 2021 Annual Meeting of ISDSA will be held online on June 5, 2021. We will invite eight speakers to share their research this year. Each speaker will give an hour-long address on their past or ongoing research.
There will also be a pre-conference workshop on statistical power analysis by Prof. Johnny Zhang from the University of Notre Dame.
Ordered by last name.
Dr. Mike Cheung -- National University of Singapore
Dr. Mike Cheung is a Professor at the Department of Psychology of the National University of Singapore. His research interests are quantitative methods, especially in the topics of meta-analysis, structural equation modeling, and multilevel modeling. His current research focus is integrating meta-analysis into the structural equation modeling framework.
Talk title: Integrating meta-analysis within the structural equation modeling framework
Abstract: Structural equation modeling (SEM) and meta-analysis are two powerful statistical methods in the educational, social, behavioral, and medical sciences. Researchers usually treat them as two unrelated topics in the literature. This presentation gives an overview of how many meta-analytic models, such as univariate, multivariate, and three-level meta-analyses, can be integrated under the SEM framework.
Dr. Kevin Grimm -- Arizona State University
Kevin Grimm, Ph.D., is a Professor in the Department of Psychology at Arizona State University. He directs the Health and Developmental Research Methods Laboratory at Arizona State University. Grimm's current research focuses on data integration, the specification of growth models for binary and ordinal outcomes, longitudinal measurement invariance, and the development and application of data mining techniques for psychological science.
Talk title: A Multiple Imputation Approach for Handling Missing Data in Decision Trees
Abstract: Decision trees (DTs) are a machine learning technique that searches the predictor space for the variable and value that lead to the best prediction when the data are partitioned based on the variable and splitting value. The algorithm repeats its search within each partition of the data until a stopping rule ends the search. Missing data can be problematic in DTs because the algorithm cannot place an observation with a missing value on the chosen splitting variable. Moreover, missing data can alter the selection process because of its inability to place such observations. Simple missing data approaches (e.g., listwise deletion, majority rule, and surrogate split) have been implemented in DT algorithms; however, more sophisticated missing data techniques have not been thoroughly examined. A modified multiple imputation approach is proposed for handling missing data in DTs, and we compare this approach with listwise deletion, delete if selected, majority rule, surrogate splits, and single imputation via Monte Carlo simulation. The proposed multiple imputation approach and surrogate splits showed superior performance with respect to prediction accuracy, variable selection, and tree size. The proposed multiple imputation approach performed best in the severe MAR conditions (e.g., strong associations among predictors, multiple predictors of missing values, small sample sizes, etc.), whereas surrogate splits performed best in MCAR or mild MAR conditions (e.g., weak associations among predictors, etc.).
Dr. Qiwei He -- Educational Testing Service
Dr. Qiwei (Britt) He is a Research Scientist for the National and International Assessment at Educational Testing Service (ETS), where she helps oversee research projects in international large-scale assessments such as PISA and PIAAC. Besides applying advanced techniques such as text mining and IRT in assessments, she has broadened her research to focus on developing new methods in analyzing big data, such as process data in log files, and to understand individuals' behavior during learning and testing.
Talk title: Leveraging Process Data in Large-Scale Assessments with Sequence Mining
Abstract: The increasing availability of data in computer-based learning and assessment environments brings a great opportunity to track big data and gain a deeper understanding of people’s behavioral patterns and cognitive processes. These new data sources, in particular finer-grained process data, often take complex and multidimensional forms that need to be analyzed with an integration of data-driven analytic approaches in addition to classical psychometric models. This talk presents recent explorations in process data analysis with sequence and text mining techniques and illustrates how to leverage process data in international large-scale assessments (e.g., PISA and PIAAC) to assist in understanding how respondents interact with the items administered, thus supporting test construction, improving the validity of conclusions, and facilitating cross-national comparisons.
Dr. David Hunter -- Pennsylvania State University
Dr. David Hunter is a Professor of Statistics at The Pennsylvania State University. Dr. Hunter has published widely on statistical models for networks and is a co-creator of the statnet suite of packages for network analysis in R. He co-proposed the MM algorithms and has written extensively on this and other EM-like algorithms. He has also extended the theory and computational practice of unsupervised clustering using nonparametric finite mixture models.
Talk title: Modeling Homophily in ERGMs for Bipartite Networks
Abstract: Bipartite networks, in which there are two disjoint sets of nodes and edges are only allowed to connect one set with the other, represent an important tool for modeling processes such as affiliations, collaborations, and co-location. Frequently, we would like to model the propensity of similar nodes to form links among themselves, a property referred to as homophily. This talk discusses homophily models in the context of exponential-family random graph models (ERGMs). Ordinarily these models are straightforward, but in a bipartite network they become complicated due to the prohibition of direct ties between nodes of the same type. We discuss a novel method for modeling homophily in this framework and illustrate its use.
Dr. Nilam Ram -- Stanford University
Dr. Nilam Ram is a Professor in the Departments of Communication and Psychology at Stanford University. Nilam’s research grows out of a history of studying change. His current projects include examinations of age-related change in children’s self- and emotion-regulation; patterns in minute-to-minute and day-to-day progression of adolescents’ and adults’ emotions; and change in contextual influences on well-being during old age. He is developing a variety of study paradigms that use recent developments in data science and the intensive data streams arriving from social media, mobile sensors, and smartphones to study change at multiple time scales.
Talk title: Screenomics: A New Venue for Discovering the Dynamics of Digital Life through Mining and Modeling of “Big Data” As “Small Data”
Abstract: We have recently developed and forwarded a new approach for capturing, visualizing, and analyzing the unique record of an individual’s everyday digital experiences – screenomics. In our quest to derive knowledge from and understand screenomes – ordered sequences of hundreds of thousands of smartphone and laptop screenshots obtained every five seconds for up to one year – the data have become a playground for learning about computational machinery used to process images and text, machine learning algorithms, human-labeling of taxonomies, qualitative inquiry, and the tension between N = 1 and N = many approaches. Using a selection of empirical examples, we illustrate how engagement with these new data is reshaping what we know about behavioral change in a wide variety of domains and how we study the person-context transactions that drive individuals’ digital lives.
Dr. Doug Steinley -- University of Missouri
Doug Steinley, Ph.D., is a Professor at the University of Missouri. His research focuses on multivariate statistical methodology, with a primary interest in cluster analysis and social network analysis. His research in cluster analysis focuses on both traditional cluster analytic procedures (e.g., k-means cluster analysis) and more modern techniques (e.g., mixture modeling). Because the general partitioning problem can be formulated in graph-theoretic terms, his research also involves combinatorics and social network analysis.
Talk title: TBA
Dr. Matthew Wilkens -- Cornell University
Dr. Matthew Wilkens is an Associate Professor of information science at Cornell University. He uses quantitative and computational methods to study large-scale developments in literary and cultural history. His work has focused in particular on literary text mining, geolocation extraction, genre detection, and the cross-pollination of critical and social-scientific methods. He is the director of the Textual Geographies project, a co-investigator of the Text Mining the Novel project, a founding editorial board member of the Journal of Cultural Analytics, and the author of Revolution: The Event in Postwar Fiction.
Talk title: Data Science in the Humanities
Abstract: Humanities disciplines like literature, history, and philosophy aren't the first things people think of when their minds turn to data science. But these fields stand to benefit from data-driven methods, and the challenging problems that humanists explore could be of great interest to researchers in data science and computational social science. This talk presents recent examples of large-scale, data-intensive humanities work, covering problems such as literary genre detection, spatial evolution in books and newspapers, and the use of generative language models to assess textual novelty. It also offers some advice for data scientists, drawn from the notoriously complex questions that the humanities seek to answer.
Dr. Jerry Jiun-Yu Wu -- National Yang Ming Chiao Tung University
Dr. Jerry Wu is currently a Professor at National Yang Ming Chiao Tung University, Taiwan. He is a quantitative methodologist specializing in Multilevel Structural Equation Modeling (MSEM) with cross-sectional and longitudinal data. His research interests focus on students' online reading behavior and performance as well as factors that motivate or hinder students' selective attention during online learning.
Talk title: Learning Analytics within the technology-enhanced environment
Abstract: In the age of data deluge, people’s digital traces, such as log files, discourse, and interaction data, bring unparalleled potential to examine their learning from different facets. Growing interest has been given to the development and use of advanced learning technology and social media to support Learning Analytics. In this talk, Dr. Jiun-Yu Wu will introduce a series of Human-centered Learning Analytics studies using data mining, supervised machine learning, and social network analysis techniques. These studies will show how learning analytics can be applied to monitor students’ learning progress, explore their interactions among peers and artifacts, and identify students at risk of failure within the Personal Learning Environment (PLE) premised on social media. The analytical findings will be discussed in line with the theoretical support and pedagogical design to build an effective learning environment for facilitating students’ learning in the post-pandemic era.
If you registered before April 15, you should have received the Zoom information for attending the meeting. Anyone can view the meeting through YouTube Live.
|Time|Speaker|Talk title|
|---|---|---|
|8:15-8:30|Organizing committee|Welcome & Introduction|
|8:30-9:30|Dr. Jerry Jiun-Yu Wu, National Yang Ming Chiao Tung University|Learning Analytics within the technology-enhanced environment|
|9:30-10:30|Dr. Mike Cheung, National University of Singapore|Integrating meta-analysis within the structural equation modeling framework|
|10:30-11:30|Dr. David Hunter, Pennsylvania State University|Modeling Homophily in ERGMs for Bipartite Networks|
|11:30-12:30|Dr. Kevin Grimm, Arizona State University|A Multiple Imputation Approach for Handling Missing Data in Decision Trees|
|12:30-13:30|Dr. Nilam Ram, Stanford University|Screenomics: A New Venue for Discovering the Dynamics of Digital Life through Mining and Modeling of “Big Data” As “Small Data”|
|13:30-14:30|Dr. Qiwei He, Educational Testing Service|Leveraging Process Data in Large-Scale Assessments with Sequence Mining|
|14:30-15:30|Dr. Matthew Wilkens, Cornell University|Data Science in the Humanities|
|15:30-16:30|Dr. Doug Steinley, University of Missouri|TBA|
|16:30-16:45|Organizing committee|Closing Remarks|
Workshop on Statistical Power Analysis
A pre-conference workshop (through Zoom) on statistical power analysis will be taught by Prof. Johnny Zhang from the University of Notre Dame. Prof. Zhang is the PI of the IES-funded project to develop methods and software for statistical power analysis.
Johnny Zhang -- University of Notre Dame
Dr. Johnny Zhang is a Professor in Quantitative Psychology at the University of Notre Dame. His research aims to develop better statistical methods and software in the areas of education, health, management, and psychology. He has conducted research in the areas of Bayesian methods, big data analysis, structural equation modeling, longitudinal data analysis, mediation analysis, and statistical computing and programming. His most recent research involves the development of new methods for social network and text analysis.
This workshop teaches researchers how to conduct a statistical power analysis to determine the sample size for a planned study. Participants will learn statistical power analysis for t-tests, one-way and two-way ANOVA, linear regression, mediation analysis, and structural equation modeling. The free software WebPower (https://webpower.psychstat.org) will be used to illustrate how to conduct power analysis in practice.
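To give a rough idea of what determining a sample size involves, here is a minimal sketch for a two-sided, two-sample t-test. It is not the WebPower method and is not taken from the workshop materials: it uses the standard normal approximation, n per group ≈ 2((z_{1-α/2} + z_{power}) / d)^2, where d is the standardized effect size (Cohen's d). The function names `sample_size_per_group` and `approx_power` are made up for this illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used for quantiles and tail probabilities.
_Z = NormalDist()


def sample_size_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample t-test.

    Uses the normal approximation (an assumption of this sketch), so
    exact t-based calculations give slightly larger values.
    """
    if d <= 0:
        raise ValueError("effect size d must be positive")
    z_alpha = _Z.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = _Z.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power for a given n per group (same approximation).

    Ignores the negligible probability of rejecting in the wrong tail.
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    return _Z.cdf(d * sqrt(n_per_group / 2) - z_alpha)


if __name__ == "__main__":
    # Small, medium, and large effects by Cohen's conventional labels.
    for d in (0.2, 0.5, 0.8):
        n = sample_size_per_group(d)
        print(f"d={d}: n per group = {n}, power = {approx_power(d, n):.3f}")
```

Under this approximation, a medium effect (d = 0.5) at α = 0.05 and 80% power needs 63 participants per group. Software based on the exact t distribution will report a slightly larger number.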
Time: 8:30AM-12:00PM (US Eastern Time) June 4, 2021
Location: Zoom (link will be sent to your registration email)
The registration fee ranges from free to $99. All revenue from the workshop will be used for the operation of the meeting and ISDSA. You will be registered for the ISDSA meeting automatically along with your workshop registration.
Undergraduate and graduate students: $49
Non-students (e.g., faculty and staff): $99
If you need financial support, simply name your price anywhere between $1 and $99.
For free registration, please register for the meeting and mention you want to attend the workshop.
If for any reason you cannot attend the workshop, you can request a refund by contacting us. The refund amount depends on when you make the request.
Before May 1: refund of the full registration fee minus a 5% processing fee (charged by the third party).
After May 1 and before June 1: refund of 80% of the full registration fee.
After June 1: no refund.
To register, please select the registration fee and click "Pay now".
Free Meeting Registration
Registration to attend the virtual meeting is free but required for planning purposes. We will send you the Zoom information through email, so make sure your email address is correct.
Update on April 3, 2021
Registration is still open. However, because Zoom is limited to 300 participants, anyone who registers after April 3 will be offered other ways to participate in the meeting. Information will be sent to the email address you registered with.
Frequently Asked Questions
The list is being updated.
- Does the meeting accept paper submission?
No. ISDSA 2021 is limited to invited speakers. We welcome you to submit your papers to our meeting in 2022.
- How much does it cost to attend the meeting?
Attending the meeting is free. However, we ask you to register to receive information about the meeting.
- How do I know my registration is successful?
If your registration is successful, you will see the message "Thank you for your registration!" in your web browser.
If you pay for the workshop registration, you will also receive an email with your receipt.
- How do I register for the workshop for free?
Given the pandemic, we understand you might not have the funds to attend the workshop. Therefore, we provide a free option. You can simply register for the meeting and add a comment stating that you plan to attend the workshop.
- When will I receive the virtual meeting information for the meeting and workshop?
You should receive the Zoom meeting information by May 20, 2021, in your email.
- Any idea about the meeting in 2022?
For now, we plan to hold an in-person meeting in China, most likely in June 2022. To receive information on the meeting, please subscribe to our mailing list. You can also bookmark this website and come back for updates.