G Credit Union (GCU) is currently undertaking several large efforts to integrate its disparate operational systems and at the same time, implement a company-wide CRM system. GCU is a large credit union with over $2B in assets and 100,000 members. GCU recently acquired another credit union and had not yet fully integrated the new customers and systems into the GCU systems.
Censcia Services
Censcia was brought into the project by a marketing services company to provide data analysis services. The marketing services firm was familiar with Censcia capabilities through previous work for a different financial institution.
Goals
The goals for the Data Discovery Session were to load the operational data from the two existing legacy systems and provide detailed analysis, especially the segmentation of customers by various factors including: Wealth, Loyalty, Life Stage, Profitability, Share of Wallet, and Present Products. Each segment then has a defined action plan of financial services tailored to meet their particular needs.
The first bit of information in any exercise similar to this is determining the accessibility of the necessary data. A mere request to provide a copy of the data for analysis answers many questions.
The data files from GCU included:
- Customer extracts
- Demand account files (Savings and Checking)
- Term deposits (CD’s)
- RSP Extract (IRA’s)
- Loans Extract (included personal loans, mortgages, business loans)
- Mutual Funds (spreadsheet with many variables at member level)
- Loan History File (Loan originations or applications)
- Card File (ATM cards)
- Safe Deposit Box File
Typcial Data Access questions:
- How is the data stored in the operational system?
- Who has access to this data?
- Is this data currently analyzed?
- Who can provide extracts of the data?
- What kind of documentation exists on the data (data dictionary, file layouts, etc.)?
Once the data was loaded, the data quality was quickly assessed. Censcia was able to apply pre-built best practices tools to rapidly provide counts and basic statistics on each field. These results show how many values exist in each field, the percentage of total for each value, etc.
Typical Data Quality Issues:
- What is the population level of each field?
- Invalid address information? (ie, 120 different values in the State column)
- Are there misspellings? (ie, 200 variations of North Las Vegas)
- Are users usurping the system? (ie, email addresses in the fax column)
- Are fields being combined? (ie, middle initial in first name field, state included in city field)
Keys/relationships Reviewed:
- What is the key column in each table?
- Is this key truly unique?
- How does this table link to other tables?
- What are the key fields?
- Are these other key fields unique?
- What uniquely defines a customer?
- Do customers have multiple accounts?
- Multiple memberships?
- Are customers householded in any manner?
Basic Statistics/Discovery
- Show the breakdown of values within each field.
- Is any value greater than 80% of the total values?
- Where are the customers from?
- What are the date ranges for the transaction file?
- Are my customers male or female?
- Show the age distribution of my customers.
The analysis then proceeds to an aggregation stage where formulas and calculations are applied to the data. Again, Censcia tools provide a catalyst for this work by allowing a skilled analyst to rapidly provide calculations and analytics.
Segmentation
- By Wealth (add balances in each deposit product and subtract out loans and mortgages)
- By Life Stage (use age ranges or combine with marital status, number of children, and home ownership)
- By Profitability (if value provided or else create formula to calculate based on transaction history and product profitability)
- By Number of Products (append customer numbers and balances from each product file and count the number of customer number appearances)
- By Loyalty (loyalty can be calculated several ways – years as a customer or number of products or number of transactions or a combination of the above)
- By Share of Wallet (determine primary financial institution, accounts and balances in other FI’s)
KPI’s
- Loan to Deposit Ratios (Accumulate balances for all deposit products and loan products and provide ratios, overall and by branch)
- Retention Numbers (number of accounts closed by branch per month, number of lost customers by branch per month)
- Products per Member (averages for various life stages, other segments)
- Growth of Portfolio (deposits and loans by month by branch, PAM, or individual member)
- Total Connection for each Personal Account Manager (each PAM was supposed to have 300 members assigned)
- Total Deposits and Loans for each Branch
Identification of Missing Data
The process of analyzing the data in each of the areas identified above reveals what data items are missing that could provide valuable insight. This may lead to further data requests (such as transaction data), analysis of alternate sources of data (home loan application data for Net Worth or share of wallet information) or the recommendation to purchase available 3rd party data for demographic and financial information. GCU had made the decision several years ago not to pull extracts of transaction data from the operational systems for analysis due to the large sizes. This data is necessary to provide much of the desired analysis including large deposits or withdrawals, automatic deposit information, automatic payment information, service fees used in profitability calculations, number of transactions per month used in loyalty or debit card rewards programs, ATM usage, online banking usage, debit card usage, etc. A wealth of data for analysis.
Today’s cheap disk space costs make analysis of these large files relatively inexpensive.
Examples of What Censcia Uncovered at GCU
- Numerous data quality issues uncovered that could impact the successful implementation of the CRM project (ie., over 300 different versions of North Las Vegas, NV) numerous problems with location of first name and middle names in various name fields, both of which impact the householding formulas). The VP of Sales and Service wanted to create a poster of the variations of North Las Vegas for all the branches to see.
- Issues with uniqueness of key fields – duplicate member records in the customer file for the same member (one person had 27 different records in the customer file) and duplicate accounts in the various product files for each member (by Social Security Number and by Customer Number)
- Inconsistent date formats between the various disparate data sources
- Issues with how PAM’s were assigned to a target group of 300 customers (some PAM’s had as many as 1,300 members assigned to them)
- The VP of Marketing mentioned that filtering out deceased members when generating campaign lists was a big time-consuming task and caused many PR headaches when not done properly. Censcia used regular expression searching to identify all records marked deceased in various methods and in various columns and provided a list to Marketing for correction. By identifying these folks, Censcia not only corrected a large time-consuming task, but was able to identify a marketing opportunity for beneficiaries of these accounts.
- Identification of exact number of demand account closings per month (current system only provided net reports and the VP of Finance was quite alarmed).
- Term deposits, Loans, and Mortgages maturing by month revealed major spikes in one month of each year. Made recommendations to balance workload at branches for these.
- Separate reporting for open vs. closed accounts, retail vs. commercial customers.
- Divided Demand accounts into checking vs. saving, loans file into loans vs. mortgages.
- Segmentation by numerous combinations and permutations
Benefits of Censcia Data Discovery Sessions
Censcia has the capability to access data and provide analysis in very short timeframes. There is very little participation needed from clients other than providing the data extracts and some sort of data dictionary. It does not create a burden on IT folks or data analysts familiar with current systems. A single consultant was able to perform all of the tasks mentioned above in 5 days.
There were additional benefits of having the data loaded and available during further discussions with clients regarding business requirements for systems design or software implementations. No question is left unanswered. The data is available and questions regarding the current systems or data can be answered immediately. The data may be displayed on the screen so that everyone can review data issues or results of queries and provide instant feedback or identify further analysis needed. This interaction typically becomes instantly gratifying for the marketing analysts involved. These sessions become major information gathering and sharing sessions – highly productive business requirements and analysis meetings.