
Data Analytics
Code and snippets from my work in Data Analytics
02
Database
-
Clean and update database records for correct zip format and date of loss on claims
-
Filter problematic last names and zip codes for claims, and update database with fix
-
Fix State given address, check sum of charges, verify facility name given NPI, and update database
-
Insert and update records info with other tables accounting for exact duplicates
-
Mask confidential fields in JSON and store processed records in database
-
Store ASC X12 alerts and automatically cross-verify status with other tables
-
Store records of already masked claims with auto-generated script to avoid future duplication
-
Use data warehouse to fill in missing ICD code descriptions and update records stored
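
As an illustration of the zip-format cleanup named in the first item above, here is a minimal sketch using Python's built-in sqlite3 module. The `claims` table, its `id` and `zip` columns, and the helper names are assumptions made for this example, not the schema or code used in the actual project.

```python
import re
import sqlite3


def normalize_zip(raw):
    """Return a 5-digit or ZIP+4 string, or None if the value cannot be repaired."""
    if raw is None:
        return None
    digits = re.sub(r"\D", "", str(raw))  # drop spaces, hyphens, stray letters
    if len(digits) == 9:
        return f"{digits[:5]}-{digits[5:]}"
    if 3 <= len(digits) <= 5:
        return digits.zfill(5)  # restore leading zeros lost by numeric storage
    return None


def fix_zip_codes(conn):
    """Rewrite every repairable zip in the hypothetical claims table; return rows changed."""
    rows = conn.execute("SELECT id, zip FROM claims").fetchall()
    changed = 0
    for claim_id, raw in rows:
        fixed = normalize_zip(raw)
        if fixed is not None and fixed != raw:
            conn.execute("UPDATE claims SET zip = ? WHERE id = ?", (fixed, claim_id))
            changed += 1
    conn.commit()
    return changed
```

Values that cannot be repaired are left untouched so they can be reviewed by hand rather than silently overwritten.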

03

Extraction
-
Filter and retain only records with JSON fields satisfying certain conditions
-
Fix State given address, check sum of charges, verify facility name given NPI, and update database
-
Remove JSON records that meet conditions specified by input file
-
Select and analyze only ASC X12 claim response codes requiring action
-
Verify address and tax ID on W-9 with OCR after cleaning image
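
The JSON filtering described in the first item of this list can be sketched as below. The sketch assumes one JSON object per line, and the field names `status` and `charge` along with the `needs_review` condition are hypothetical stand-ins for whatever conditions a real job applies.

```python
import json


def filter_records(lines, predicate):
    """Parse one JSON object per line and keep those for which predicate is true."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        if predicate(record):
            kept.append(record)
    return kept


def needs_review(record):
    """Hypothetical condition: a denied claim with a positive charge."""
    return record.get("status") == "denied" and float(record.get("charge", 0)) > 0
```

Passing the condition in as a function keeps the parsing loop unchanged when the retention rules change.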
04
Cleaning
-
Add missing ICD code descriptions through API and convert ICD-9 to ICD-10 codes
-
Clean and update database records for correct zip format and date of loss on claims
-
Filter and retain only records with JSON fields satisfying certain conditions
-
Fix State given address, check sum of charges, verify facility name given NPI, and update database
-
Generate different group-by reports after cleaning and integrating data from different tables
-
Fill in missing values by multiple imputation to model recidivism rate for probationers with machine learning
-
Fill in missing values by multiple imputation for machine learning in Real Estate Analysis NYC
-
Pick relevant variables and period for mask mandates and group data into useful segments
-
Remove stopwords to match insurance names from different sources
-
Remove JSON records that meet conditions specified by input file
-
Verify address and tax ID on W-9 with OCR after cleaning image
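
For the stopword-based name matching listed above, one possible approach is to reduce each insurer name to a key of its distinguishing tokens and compare keys. This is only a sketch: the stopword list and the function names are assumptions chosen for illustration, and a real list would be tuned to the data.

```python
import re

# Hypothetical stopwords: tokens that rarely distinguish one insurer from another.
STOPWORDS = {"inc", "llc", "co", "corp", "company", "insurance", "ins", "the", "of", "group"}


def name_key(name):
    """Lowercase the name, strip punctuation, and drop stopword tokens."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(t for t in tokens if t not in STOPWORDS)


def match_names(source_a, source_b):
    """Map each name in source_a to the names in source_b that share its key."""
    index = {}
    for name in source_b:
        index.setdefault(name_key(name), []).append(name)
    return {name: index.get(name_key(name), []) for name in source_a}
```

A name made up entirely of stopwords reduces to an empty key, so such names are worth checking by hand before trusting a match.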

05

Integration
-
Insert and update records info with other tables accounting for exact duplicates
-
Merge CSV data sets by creating and joining temporary database tables
-
Show number of records processed while displaying current DateTime in specified timezone
-
Transform old JSON v1 to new format with Python for database
-
Use data warehouse to fill in missing ICD code descriptions and update records stored
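
Merging CSV data sets through temporary tables, as named in the list above, might look like the following sketch with Python's csv and sqlite3 modules. The sample data, the table names, and the `load_csv` helper are invented for this example.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, text):
    """Load CSV text into a temporary table whose columns come from the header row.

    Table and column names are interpolated into SQL, so use trusted input only.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TEMP TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)


# Invented sample data standing in for two CSV files.
claims_csv = "claim_id,member_id,amount\nC1,M1,100.00\nC2,M2,250.50\n"
members_csv = "member_id,state\nM1,NY\nM2,NJ\n"

conn = sqlite3.connect(":memory:")
load_csv(conn, "claims", claims_csv)
load_csv(conn, "members", members_csv)

# Join the two temporary tables on their shared column.
merged = conn.execute(
    "SELECT c.claim_id, c.amount, m.state "
    "FROM claims c JOIN members m USING (member_id) ORDER BY c.claim_id"
).fetchall()
```

Temporary tables disappear when the connection closes, so the merge leaves no trace in the database.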
06
Analytics
-
Check only specified fields in JSON for errors and find the affected customer
-
Find discrepancies between records loaded into the database and customer files
-
Find overlaps between CSV and database claims based on select variables
-
Remove stopwords to match insurance names from different sources
-
Select and analyze only ASC X12 claim response codes requiring action
-
Uncover data issues in matching external vendor requirements or specifications
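
Finding overlaps between CSV and database claims on select variables, as listed above, can be sketched with a set of normalized key tuples. The row layout (dictionaries keyed by column name) and the `find_overlaps` name are assumptions for this example.

```python
def find_overlaps(csv_rows, db_rows, keys):
    """Return the CSV rows whose values on `keys` also appear in a database row."""

    def key_of(row):
        # Compare case-insensitively and ignore surrounding whitespace.
        return tuple(str(row[k]).strip().lower() for k in keys)

    db_keys = {key_of(row) for row in db_rows}
    return [row for row in csv_rows if key_of(row) in db_keys]
```

Building the set once keeps the comparison linear in the number of rows instead of checking every pair.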
