Mark Howison

An experienced data scientist.

Originally trained as a computer scientist, I have built my career on multidisciplinary partnerships with economists, doctors, and policymakers to solve challenges in human health and public policy. Currently, I am a Senior Applied Scientist at Amazon (all opinions here are my own) working on employee experience. I also serve on the board of Research Improving People's Lives, a tech-for-social-impact nonprofit that helps governments use data, science, and technology to equitably improve policy and lives. I received my M.S. in Computer Science from UC Berkeley.


Projects


Pandemic Unemployment Assistance

I led the rapid development and deployment of Rhode Island's emergency unemployment assistance system in response to the COVID-19 economic shutdown. My team launched this production system in just 10 days following the passage of the US CARES Act. It accepted over 450,000 applications for benefits during its operation, accommodating a surge of 10,000 applications in the first 12 hours following launch.


Connecting Workers to New Careers

Through partnerships with state governments across the country, I led the development of recommendation systems that help job seekers discover new careers. The underlying algorithm combines two ingredients: causal machine learning methods that identify career transitions that resulted in increased earnings and employment for previous job seekers, and measures of skill similarity derived from natural-language processing of millions of full-text job descriptions.


Monitoring COVID-19 Variants

I developed the bioinformatics pipeline used to monitor the emergence and prevalence of COVID-19 variants in Rhode Island. Regular reports from this system provided the Rhode Island Department of Health and Governor's Office with information to guide their public health response to COVID-19.


Predicting Opioid Dependence

As many as 80% of those suffering from an opioid use disorder had a legitimate opioid prescription from a doctor prior to their diagnosis. I helped develop a predictive model for the risk of developing a disorder if given an opioid prescription. We are partnering with policymakers to provide this information to doctors as they weigh the risks and benefits of opioid therapy for new patients.


The Rhode to College

I served as the technology director for Rhode2College, an innovative program announced on Sep 24, 2018 by Governor Gina Raimondo to help Rhode Island high school students succeed on their path to college.


Big Data for Policy Innovation

I led a data science team that integrated over 800 data sets from Rhode Island government agencies into an anonymized and secure database for delivering policy insights.


Disrupting HIV Transmission

The actual transmission network between HIV-infected individuals is unknown, but gene sequencing of new infections can reveal patterns of transmission. I served as the lead bioinformatician on an NIH-funded project to use these patterns to help public health officials disrupt HIV transmission.


Data Mining

Data in the real world is messy and not always ready for analysis. I have developed computer-vision methods to extract historical data on changes in industrial land use from printed directories, natural language processing methods to extract occupation and skills from job postings and resumes, and a comprehensive directory of FDA drug codes for understanding prescriptions in medical claims.


Measuring HIV Drug Resistance

Modern gene sequencing technologies can monitor HIV infections with high precision and have the potential to improve and personalize drug therapies for treating HIV. I have studied methods for measuring drug resistance in HIV-infected individuals and worked with an international group of HIV researchers to recommend future standards for clinical applications.


Technology for Research

I understand the technological needs of researchers and can translate them into IT solutions. I was the technical architect for a secure computing environment at Brown University that served over 150 researchers across 17 labs and centers in fields such as public policy, economics, public health, and biomedical informatics. I have also conducted performance studies on large computing clusters and developed methods for managing research software and tracking complex analyses of big data.


Publications

2024

Howison M, Angell M, Hastings JS. 2024. Protecting Sensitive Data with Secure Data Enclaves. Digital Government: Research and Practice, in press. doi:10.1145/3643686

2023

Novitsky V, et al. 2023. Added Value of Next Generation Sequencing in Characterizing the Evolution of HIV-1 Drug Resistance in Kenyan Youth. Viruses 15(7): 1416. doi:10.3390/v15071416

Dixon N, et al. 2023. Occupational models from 42 million unstructured job postings. Patterns 4(7): 100757. doi:10.1016/j.patter.2023.100757

Howison M, et al. 2023. An Automated Bioinformatics Pipeline Informing Near-Real-Time Public Health Responses to New HIV Diagnoses in a Statewide HIV Epidemic. Viruses 15(3): 737. doi:10.3390/v15030737

Novitsky V, et al. 2023. Not all clusters are equal: Dynamics of molecular HIV-1 clusters in a statewide Rhode Island epidemic. AIDS 37(3): 389-399. doi:10.1097/QAD.0000000000003426

Howison M, Long J. 2023. Recommending Career Transitions to Job Seekers Using Earnings Estimates, Skills Similarity, and Occupational Demand. Available at SSRN: https://ssrn.com/abstract=4371445

2022

Hastings JS, Howison M. 2022. Predicting Divertible Medicaid Emergency Department Costs. Digital Government: Research and Practice 3(3): 19:1-19:19. doi:10.1145/3548692

Singh M, et al. 2022. SARS-CoV-2 Variants in Rhode Island; May 2022 Update. Rhode Island Medical Journal 105(6): 6-11.

Howison M, Goggins M. 2022. SIRAD: Secure Infrastructure for Research with Administrative Data. Software Impacts 12: 100245. doi:10.1016/j.simpa.2022.100245

Steingrimsson JA, et al. 2022. Beyond HIV outbreaks: protocol, rationale and implementation of a prospective study quantifying the benefit of incorporating viral sequence clustering analysis into routine public health interventions. BMJ Open 12(4): e060184. doi:10.1136/bmjopen-2021-060184

Earnest R, et al. 2022. Comparative transmissibility of SARS-CoV-2 variants delta and alpha in New England, USA. Cell Reports Medicine 3(4): 100583. doi:10.1016/j.xcrm.2022.100583

Guang A, et al. 2022. Incorporating Within-Host Diversity in Phylogenetic Analyses for Detecting Clusters of New HIV Diagnoses. Frontiers in Microbiology 12: 803190. doi:10.3389/fmicb.2021.803190

Munro C, et al. 2022. Evolution of Gene Expression across Species and Specialized Zooids in Siphonophora. Molecular Biology and Evolution 39(2): msac027. doi:10.1093/molbev/msac027

Novitsky V, et al. 2022. Statewide Longitudinal Trends in Transmitted HIV-1 Drug Resistance in Rhode Island, USA. Open Forum Infectious Diseases 9(1): ofab587. doi:10.1093/ofid/ofab587

2021

Beckwith CG, et al. 2021. HIV Drug Resistance and Transmission Networks Among a Justice-Involved Population at the Time of Community Reentry in Washington, D.C. AIDS Research and Human Retroviruses 37(12): 903-912. doi:10.1089/aid.2020.0267

Howison M, et al. 2021. Protecting Sensitive Data with Secure Data Enclaves. OSF Preprints: jmd7t. doi:10.31219/osf.io/jmd7t

Angell M, et al. 2021. Estimating Value-added Returns to Labor Training Programs with Causal Machine Learning. OSF Preprints: thg23. doi:10.31219/osf.io/thg23

Novitsky V, et al. 2021. Longitudinal typing of molecular HIV clusters in a statewide epidemic. AIDS 35(11): 1711-1722. doi:10.1097/QAD.0000000000002953

Kantor R, et al. 2021. SARS-CoV-2 Variants in Rhode Island. Rhode Island Medical Journal 104(7): 50-54.

Guang A, et al. 2021. Revising transcriptome assemblies with phylogenetic information. PLOS ONE 16(1): e0244202. doi:10.1371/journal.pone.0244202

2020

Novitsky V, et al. 2020. Empirical comparison of analytical approaches for identifying molecular HIV-1 clusters. Scientific Reports 10(1): 18547. doi:10.1038/s41598-020-75560-1

Kantor R, et al. 2020. Challenges in evaluating the use of viral sequence data to identify HIV transmission networks for public health. Statistical Communications in Infectious Diseases 12(s1). doi:10.1515/scid-2019-0019

Angell M, et al. 2020. Delivering Unemployment Assistance in Times of Crisis. Digital Government: Research and Practice 2(1): 5:1-5:11. doi:10.1145/3428125

Parkin NT, et al. 2020. Multi-Laboratory Comparison of Next-Generation to Sanger-Based Sequencing for HIV-1 Drug Resistance Genotyping. Viruses 12(7): 694. doi:10.3390/v12070694

Hastings JS, Howison M, Inman SE. 2020. Predicting high-risk opioid prescriptions before they are given. Proceedings of the National Academy of Sciences 117(4): 1917-1923. doi:10.1073/pnas.1905355117

2019

Hastings JS, et al. 2019. Unlocking data to improve public policy. Communications of the ACM 62(10): 48-53. doi:10.1145/3335150

Berenbaum D, et al. 2019. Mining Spatio-temporal Data on Industrialization from Historical Registries. Journal of Environmental Informatics 34(1): 28-34. doi:10.3808/jei.201700381

Howison M, Coetzer M, Kantor R. 2019. Measurement error and variant-calling in deep Illumina sequencing of HIV. Bioinformatics 35(12): 2029-2035. doi:10.1093/bioinformatics/bty919

2018

Munro C, et al. 2018. Improved phylogenetic resolution within Siphonophora (Cnidaria) with implications for trait evolution. Molecular Phylogenetics and Evolution 127: 823-833. doi:10.1016/j.ympev.2018.06.030

Ji H, et al. 2018. Bioinformatic data processing pipelines in support of next-generation sequencing-based HIV drug resistance testing: the Winnipeg Consensus. Journal of the International AIDS Society 21(10): e25193. doi:10.1002/jia2.25193

2017

Howison M, Bethel EW. 2017. GPU-accelerated denoising of 3D magnetic resonance images. Journal of Real-Time Image Processing 13(4): 713-724. doi:10.1007/s11554-014-0436-8

2016

Guang A, et al. 2016. An Integrated Perspective on Phylogenetic Workflows. Trends in Ecology & Evolution 31(2): 116-126. doi:10.1016/j.tree.2015.12.007

2015

Zapata F, et al. 2015. Phylogenomic Analyses Support Traditional Relationships within Cnidaria. PLOS ONE 10(10): e0139068. doi:10.1371/journal.pone.0139068

Bethel EW, et al. 2015. Improving Performance of Structured-Memory, Data-Intensive Applications on Multi-core Platforms via a Space-Filling Curve Memory Layout. In Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, pp. 565-574, 25-29 May 2015, Hyderabad, India. doi:10.1109/IPDPSW.2015.71

Howison M, Shen A. 2015. Bioinformatics Brew: the cross-platform package manager for open-source bioinformatics tools. Poster presented at Bio-IT World, April 21-23, Boston, MA, USA. doi:10.7301/Z0Z60KZD

2014

Zapata F, et al. 2014. Phylogenomic analyses of deep gastropod relationships reject Orthogastropoda. Proceedings of the Royal Society B: Biological Sciences 281(1794): 20141739. doi:10.1098/rspb.2014.1739

Howison M, et al. 2014. Bayesian Genome Assembly and Assessment by Markov Chain Monte Carlo Sampling. PLOS ONE 9(6): e99497. doi:10.1371/journal.pone.0099497

2013

Howison M, Zapata F, Dunn CW. 2013. Toward a statistically explicit understanding of de novo sequence assembly. Bioinformatics 29(23): 2959-2963. doi:10.1093/bioinformatics/btt525

Dunn CW, Howison M, Zapata F. 2013. Agalma: an automated phylogenomics workflow. BMC Bioinformatics 14(1): 330. doi:10.1186/1471-2105-14-330

Howison M, Shen A, Loomis A. 2013. Building Software Environments for Research Computing Clusters. In Proceedings of the 27th Large Installation System Administration Conference (LISA '13), 3-8 November 2013, Washington, DC, USA.

Howison M. 2013. High-throughput compression of FASTQ data with SeqDB. IEEE/ACM Transactions on Computational Biology and Bioinformatics 10(1): 213-218. doi:10.1109/TCBB.2012.160

2012

Bethel EW, Howison M. 2012. Multi-core and many-core shared-memory parallel raycasting volume rendering optimization and tuning. International Journal of High Performance Computing Applications 26(4): 399-412. doi:10.1177/1094342012440466

Howison M, Sinnott-Armstrong NA, Dunn CW. 2012. BioLite, a lightweight bioinformatics framework with automated tracking of diagnostics and provenance. In Proceedings of the 4th USENIX Workshop on the Theory and Practice of Provenance (TaPP '12), 14-15 June 2012, Boston, MA, USA.

Howison M, Bethel EW, Childs H. 2012. Hybrid Parallelism for Volume Rendering on Large-, Multi-, and Many-Core Systems. IEEE Transactions on Visualization and Computer Graphics 18(1): 17-29. doi:10.1109/TVCG.2011.24

2011

Howison M, et al. 2011. The Mathematical Imagery Trainer: From Embodied Interaction to Conceptual Learning. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1989-1998, 7-12 May 2011, Vancouver, BC, Canada. doi:10.1145/1978942.1979230

2010

Howison M, et al. 2010. H5hut: A High-Performance I/O Library for Particle-based Simulations. In Workshop on Interfaces and Abstractions for Scientific Data Storage (IASDS '10), 20-24 Sept. 2010, Heraklion, Crete, Greece. doi:10.1109/CLUSTERWKSP.2010.5613098

Howison M, et al. 2010. Tuning HDF5 for Lustre File Systems. In Workshop on Interfaces and Abstractions for Scientific Data Storage (IASDS '10), 20-24 Sept. 2010, Heraklion, Crete, Greece.

Childs H, et al. 2010. Extreme Scaling of Production Visualization Software on Diverse Architectures. IEEE Computer Graphics and Applications 30(3): 22-31. doi:10.1109/MCG.2010.51

Uselton A, et al. 2010. Parallel I/O performance: From events to ensembles. In Proceedings of the 2010 IEEE International Symposium on Parallel & Distributed Processing, 19-23 April 2010, Atlanta, GA, USA. doi:10.1109/IPDPS.2010.5470424

2009

Howison M, Séquin CH. 2009. CAD Tools for the Construction of 3D Escher Tiles. Computer-Aided Design and Applications 6(6): 737-748. doi:10.3722/cadaps.2009.737-748