A little about me

I am a postdoctoral fellow in computational pathology at the University of Texas MD Anderson Cancer Center. I received my doctorate in Computational Statistics from the joint doctoral program at San Diego State University and Claremont Graduate University. My dissertation work focused on designing and implementing multiple Random Forest-based algorithms to solve computational challenges in autism research. I also hold two master's degrees: the first in Mathematical Statistics from Shiraz University in Iran and the second in Biostatistics from San Diego State University.
My research interests include machine learning, data mining, and spatial analysis. Currently, I am developing a spatial pipeline to analyze CODEX images of glioblastoma.

Research Projects

Matching Methods for Observational Data With Small Group Sizes

Research Question: Which matching algorithm should we use to obtain the desired balance between two groups while retaining enough subjects in each group?
In observational studies, matching is used to optimize both balance and sample size. Although many matching algorithms can achieve this goal, matching faces its own challenges in some fields. Datasets with small sample sizes and a limited control reservoir are especially prone to this issue. The problem arises in many ongoing research studies, including those on autism spectrum disorders (ASDs). In this project, we are interested in eliminating the effect of undesirable variables using two types of algorithms: 1:k nearest-neighbor matching and full matching.
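To make the trade-off between balance and retained sample size concrete, here is a minimal sketch of greedy 1:k nearest-neighbor matching on a single covariate. It is an illustration only, not the procedure used in the project: the function names, the caliper parameter, the subject ids, and the age values are all hypothetical.

```python
from statistics import mean, stdev


def nearest_neighbor_match(treated, controls, k=1, caliper=None):
    """Greedy 1:k nearest-neighbor matching without replacement on one covariate.

    treated, controls: dicts mapping subject id -> covariate value (e.g. age).
    Returns a dict mapping each retained treated id to a list of k control ids.
    """
    available = dict(controls)  # controls still in the reservoir
    matches = {}
    for t_id, t_val in treated.items():
        # Rank the remaining controls by distance to this treated subject.
        ranked = sorted(available, key=lambda c_id: abs(available[c_id] - t_val))
        if caliper is not None:
            ranked = [c for c in ranked if abs(available[c] - t_val) <= caliper]
        if len(ranked) < k:
            continue  # too few acceptable controls: this subject is dropped
        matches[t_id] = ranked[:k]
        for c_id in ranked[:k]:
            del available[c_id]
    return matches


def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = ((stdev(a) ** 2 + stdev(b) ** 2) / 2) ** 0.5
    return (mean(a) - mean(b)) / pooled_sd


# Hypothetical ages for a small ASD group and a limited control reservoir.
asd = {"a1": 10.0, "a2": 12.0, "a3": 15.0}
td = {"c1": 9.5, "c2": 12.5, "c3": 14.0, "c4": 20.0, "c5": 30.0}

matches = nearest_neighbor_match(asd, td, k=1, caliper=2.0)
matched_controls = [td[c] for cs in matches.values() for c in cs]
matched_treated = [asd[t] for t in matches]

smd_before = standardized_mean_difference(list(asd.values()), list(td.values()))
smd_after = standardized_mean_difference(matched_treated, matched_controls)
print(matches, round(smd_before, 2), round(smd_after, 2))
```

Raising k or tightening the caliper in this sketch shows the tension described above: a stricter caliper improves balance but drops subjects, which a small control reservoir can rarely afford.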

"IterMatch", Statistical R Package for Multivariate Matching Problems

Research Question: How can two groups of participants be optimally matched on multiple variables when data are missing?
Matching two groups on multiple confounding variables is one of the primary steps in conducting observational studies. Existing matching algorithms require a complete dataset; none can operate on missing data. Datasets with missing values must therefore be imputed, or matching must be restricted to complete cases. Losing data because of a limitation in the matching algorithm reduces the power of the study and discards important information.
In this project, we introduce the R package iterMatch, which tackles these shortcomings. The package finds a one-to-one subsample of the data that is balanced on all matching variables while incorporating missing values in an iterative manner. Random forest is the crucial tool for handling missing values when constructing the distance matrix that is passed to an optimal matching algorithm. We measure the robustness of the matching results by injecting different levels of missing values into two datasets of medium and large size for comparison. More detail is provided in this chapter.
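The general shape of the idea, a distance matrix that tolerates missing values followed by optimal one-to-one matching, can be sketched as follows. This is not the iterMatch implementation: the random forest step is replaced here by a much simpler partial Euclidean distance over the variables observed in both records, the optimal matching is found by exhaustive search (feasible only for tiny inputs), and all names and data values are hypothetical. The variables are assumed to be on comparable scales.

```python
from itertools import permutations
from math import sqrt


def partial_distance(x, y):
    """Euclidean distance over the variables observed (not None) in both
    records, rescaled so that sharing fewer variables is not an advantage."""
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return float("inf")  # nothing to compare on
    squared = sum((a - b) ** 2 for a, b in shared)
    return sqrt(squared * len(x) / len(shared))


def optimal_one_to_one(group_a, group_b):
    """Exhaustive optimal matching: choose the assignment of group_b records
    to group_a records with the smallest total distance.
    Requires len(group_a) <= len(group_b)."""
    dist = [[partial_distance(a, b) for b in group_b] for a in group_a]
    best_cost, best_assignment = float("inf"), None
    for perm in permutations(range(len(group_b)), len(group_a)):
        cost = sum(dist[i][j] for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_assignment = cost, perm
    return best_assignment, best_cost


# Hypothetical records (two matching variables); None marks a missing value.
group_a = [(10.0, 100.0), (12.0, None)]
group_b = [(12.5, 110.0), (10.5, 101.0), (30.0, 90.0)]

assignment, cost = optimal_one_to_one(group_a, group_b)
print(assignment, round(cost, 3))  # assignment[i] is the group_b index matched to group_a[i]
```

In this sketch the second record of group_a is still matched despite its missing value, rather than being discarded as a complete-case analysis would require.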

Mixed-Effects Random Forest-based Classification Algorithms for Clustered Data

Research Question: How can we build a high-accuracy binary classifier from multi-modal, multi-site imaging data?
To date, a variety of classification schemes have been proposed, and the accuracy of classification has reached as high as 95 percent for many disorders, including Autism Spectrum Disorder (ASD). However, building a reliable and robust classification model for ASD requires a large dataset, which is often obtained by pooling imaging data from multiple sites. In addition to the larger sample size, including multiple MRI modalities can give a more coherent picture of the brain. Two challenges come with a larger, multi-modal dataset. First, the variation introduced by multiple imaging sites must be controlled. Second, a dimension reduction algorithm is needed to cope with the computational complexity of multi-modal data. Controlling multi-site variability is particularly important for building a robust and accurate classification model, because it can be confounded with the heterogeneous nature of ASD.
We addressed both concerns by proposing two mixed-effects random forest-based classification algorithms, applicable to multi-site (clustered) data using rs-fMRI and structural MRI (sMRI) modalities. These algorithms control the random effects of the confounding factor of the imaging site. Additionally, the algorithms internally control the fixed effect of phenotypic variables such as age while building the classification model. Moreover, they handle high-dimensional data such as functional connectivity in a non-linear fashion, which eliminates the need for a separate dimension reduction algorithm.

Diagnostic Classification of ASD Using Multimodal Imaging Data

Research Question: Which imaging modality is most informative for the classification of ASD?
Despite the numerous studies that have applied machine learning algorithms to different neuroimaging modalities for the diagnostic classification of autism spectrum disorders (ASDs), none have assessed whether one MRI modality may be more informative than the others. In this study, conditional random forest (CRF) was applied to structural (anatomical) MRI, diffusion tensor imaging (DTI), and functional connectivity MRI (fcMRI) data to assess which modality may be more or less informative.

Potential Subtyping Using Behavioral Measurements

Research Question: Can subtyping ASD participants based on behavioral and diagnostic measures reveal distinct resting-state functional connectivity (rs-fc) patterns?
ASDs are characterized by great heterogeneity of symptoms, likely secondary to multiple etiological subtypes of ASDs. Few studies have focused directly on imaging data with the goal of characterizing subtypes of ASDs, but in the study performed by Theo, behavioral and neuropsychological data are used with unsupervised machine learning to characterize potential subtypes of ASDs. We are investigating whether the subtypes identified from the behavioral data are linked to different resting-state functional connectivity profiles.

Publication

Google Scholar.
