SUMMARY:
The Data Science team at AF Group, a commercial Property & Casualty insurer, is seeking an intern who's excited to get hands-on experience using machine learning, statistics, and natural language processing to solve complex business problems and help deliver valuable insights to our claims and pricing business partners.
JOB DESCRIPTION:
Perform exploratory data analysis and feature engineering.
Resolve data-related challenges such as outliers, missing data, and imbalanced targets.
Build predictive models using supervised and unsupervised learning approaches on structured and unstructured data sets.
Train language models to extract information from notes and scanned documents.
Conduct hyper-parameter tuning and feature selection.
Perform A/B testing.
Provide explainable results on black-box models using SHAP, LIME, or similar techniques.
Evaluate model performance and create exhibits such as lift charts.
Learn about risk, pricing, claims, or actuarial science.
Gain an understanding of insurance and how the business works.
Present work completed during sprint reviews.
EMPLOYMENT QUALIFICATIONS:
Have status as either a graduate student or a third-year undergraduate by the end of the spring term.
Completed at least five courses related to data science, machine learning, statistical modeling, or optimization.
Hold a cumulative GPA of 3.5 or better as of the most recent grading period.
Be able to work full-time during normal business hours for this summer with a start date between mid-May and mid-June.
EDUCATION OR EQUIVALENT EXPERIENCE:
Should hold or be pursuing a bachelor's or advanced degree in data science, statistics, mathematics, operations research, engineering, physics, actuarial science, or similar quantitative field.
EXPERIENCE:
With proper education and projects (either personal or scholastic), no prior work experience is necessary.
SKILLS/KNOWLEDGE/ABILITIES (SKA) REQUIRED:
Comfortable programming in Python.
Knowledge of SQL with ability to join multiple tables.
Knowledge and experience with all of the following:
Linear and logistic regression
Gradient-boosted decision trees (e.g., LightGBM or similar)
Neural networks
K-means clustering
Knowledge of any two or more of the following:
Causality modeling: regression discontinuity, matching (e.g., propensity score), meta-learners, etc.
Natural language processing: TF-IDF, LDA, word2vec, BERT, LLMs, etc.
Time series modeling using a linear model (e.g., ARIMA), a state-space model (e.g., Kalman filter), or a neural network model (e.g., LSTM)
Advanced clustering methods: t-SNE, Gaussian mixture models, or UMAP
Graph data mining and network science
Bayesian linear modeling
Collaborative filtering or low rank models
Mixed-effect modeling
Reinforcement learning
Linear programming using simplex or similar methods
Stochastic processes and Markov chains
Experience with Git repos for version control.
Comfortable with cloud computing platforms such as Azure, AWS, or GCP.
WORKING CONDITIONS:
Work is performed in an office setting or virtually, with no unusual hazards.
The qualifications listed above are intended to represent the minimum education, experience, skills, knowledge, and ability levels associated with performing the duties and responsibilities contained in this job description.
Actual compensation decisions rely on the consideration of internal equity, the candidate's skills and professional experience, geographic location, market, and other potential factors. It is not standard practice for an offer to be at or near the top of the range; a reasonable estimate for this role is therefore between $18 and $33.
We are an Equal Opportunity Employer. We will not tolerate discrimination or harassment in any form. Candidates for the position stated above are hired on an "at will" basis. Nothing herein is intended to create a contract.
Job #638714