The Risks and Management of Algorithmic Bias in Fair Lending

    by Britt Faircloth, CRCM, Senior Regulatory Consultant at Wolters Kluwer

    Published November 06, 2019

    As published in ABA Bank Compliance Magazine


    Most of us receive news and other information pushed to us throughout the day, and we also visit Facebook and other social media feeds to seek information. We may think we are searching for news, but what we see is aggregated content served to us, based on what an algorithm predicts we would like. Some algorithms are based on what a person with similar interests and demographics would like to read. Sites such as Facebook and Google base their feeds on algorithms that predict what people are likely to find relevant, based on their specific past history (e.g., searches, likes, and click-throughs) and whatever topics may be available.

    How is it possible that an algorithm can predict what we will decide is relevant? The answer lies in machine learning (ML) algorithms, which give systems the ability to learn and improve from experience automatically, without being explicitly programmed. ML algorithms observe data (such as aggregated data), programmers’ instructions, or direct experience (such as users’ clicks), and then look for patterns. The primary aim is to learn automatically, without human intervention or assistance, and to adjust actions accordingly.

    Let’s use Facebook as an example, since many banks advertise there and it has 2.4 billion monthly active users (www.statista.com/statistics/264810/number-of-monthly-active-facebook-users-worldwide/). Facebook has a total inventory of content representing a variety of interests, and it assigns “signals” to each piece of that content. Signals represent the type of content, the publisher, its age, its purpose, and more; these are the factors that advertisers control. Facebook then predicts how likely a user is to have a positive interaction with a content piece, based on the signals assigned to it. Finally, Facebook determines a “score,” the number assigned to a piece of content based on the likelihood that the user will respond positively. In other words, content scores are relative to one another, based on the likelihood of a positive response. As a result, the same content could be scored low for one person and high for another, based on their individual interests and their interactions with content they were previously served.
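    To make the mechanics more concrete, below is a minimal sketch of how this kind of relevance scoring might work conceptually. The signal names, affinity weights, and scoring formula are illustrative assumptions for this article, not Facebook’s actual ranking model.

```python
# Illustrative sketch only: the signal names, weights, and scoring formula are
# hypothetical, not Facebook's actual ranking model.
from dataclasses import dataclass

@dataclass
class ContentItem:
    title: str
    signals: dict  # e.g., content type, publisher

def predict_positive_interaction(user_history: dict, item: ContentItem) -> float:
    """Toy prediction: weight each signal by how often this user engaged
    with similar signals in the past."""
    score = 0.0
    for signal, value in item.signals.items():
        affinity = user_history.get((signal, value), 0.0)  # past click-through rate
        score += affinity
    return score

def rank_feed(user_history: dict, inventory: list) -> list:
    # The same item can score high for one user and low for another,
    # because the score depends on each user's own history.
    return sorted(inventory,
                  key=lambda item: predict_positive_interaction(user_history, item),
                  reverse=True)

inventory = [
    ContentItem("Mortgage rate update", {"type": "finance", "publisher": "bank"}),
    ContentItem("Cat video", {"type": "entertainment", "publisher": "creator"}),
]
user_history = {("type", "finance"): 0.8, ("type", "entertainment"): 0.1}
for item in rank_feed(user_history, inventory):
    print(item.title, predict_positive_interaction(user_history, item))
```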

    ML is a subset of artificial intelligence (AI): it is the application of AI in which systems learn and improve from experience. AI is the broader concept of machines carrying out tasks that would otherwise require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. If your bank is not currently using these technologies, your vendors probably are, and you will be in the future.

    What are the risks of AI and ML?

    In theory, the use of AI and machine learning in lending decision-making, operations, and processing presents an interesting fair lending paradox. Machines, and the algorithms they utilize, have the potential to reduce human bias in lending. After all, they have no gender, no race, and no preconceived notions relating to any prohibited basis of discrimination. In that regard, applying AI in the loan underwriting or pricing process has the potential to reduce or eliminate bias or discriminatory actions that are of a more human nature—such as those that arise from discretion.

    However, AI is not completely free of bias. The use of AI in lending activities creates the risk of a different type of bias—algorithmic bias (www.techtimes.com/articles/240769/20190402/ai-perpetuating-human-bias-in-the-lending-space.htm).

    Algorithmic Bias

    Algorithmic bias is not wholly different from human bias. It can be intentional or unintentional, and it can be introduced by a human programmer or arise from past biases or issues inherent in the underlying data. In practice, algorithmic bias has an amplified impact—it is challenging to diagnose, spreads rapidly, and is difficult to shut down.

    A prime example comes from Amazon’s use of AI in a recruiting tool that reviewed incoming resumes to identify top candidates for possible hiring. While the tool was not given specific instructions relating to gender, it was not evaluating resumes in a gender-neutral manner. In fact, the system taught itself that male applicants were preferred, which is believed to be due to the data utilized. The system was trained to review resumes based on patterns in resumes previously submitted to the company, and those resumes came from a time when the tech industry was largely dominated by men. The data set was therefore heavily skewed, and as a result the tool treated resumes containing terms such as “women’s,” or listing an educational background from a women’s college, as less favorable, since those terms were not prevalent in the training data (fortune.com/2018/10/10/amazon-ai-recruitment-bias-women-sexist/).
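    To see how this can happen without gender ever being an input, consider the minimal sketch below. It uses entirely synthetic data and a simple logistic regression; the features and skew are invented to illustrate the mechanism, not to reproduce Amazon’s tool.

```python
# Hypothetical illustration with synthetic data: a classifier trained on a
# historically skewed corpus learns a negative weight for the term "women's"
# even though gender itself is never an input feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
# Feature 0: resume contains the word "women's" (rare in the skewed corpus).
contains_womens = rng.binomial(1, 0.1, n)
# Feature 1: years of experience (genuinely job-related).
experience = rng.normal(5, 2, n)
# Historical "hired" labels are skewed: candidates whose resumes contained
# "women's" were hired less often, independent of experience.
hired = (experience + rng.normal(0, 1, n) - 2.0 * contains_womens > 4.5).astype(int)

X = np.column_stack([contains_womens, experience])
model = LogisticRegression().fit(X, hired)
print("weight on 'women's' term:", model.coef_[0][0])  # negative: the bias is learned
print("weight on experience:   ", model.coef_[0][1])  # positive
```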

    Algorithmic Bias with Alternative Data

    Compliance officers should also keep in mind that AI models may incorporate forms of alternative data into credit decisions. The 2018 Treasury Department report, A Financial System That Creates Economic Opportunities: Nonbank Financials, Fintech, and Innovation, lists traditional alternative data points such as rental payments, utility payments, employment history, property ownership, and address stability, as well as nontraditional alternative data such as social media activity, browsing history, behavioral data, shopping patterns, and data about friends and associates (home.treasury.gov/sites/default/files/2018-08/A-Financial-System-that-Creates-Economic-Opportunities---Nonbank-Financials-Fintech-and-Innovation_0.pdf).

    Machine learning algorithms combined with these large sets of alternative data create opportunities to expand access to credit; however, they also create risk. Incorporating new data raises a number of ethical questions. When AI and ML are combined with large amounts of aggregated data, these technologies can find empirical relationships between new data elements and consumer behavior. Those relationships could then be factored into scoring and, ultimately, credit decisions.

    The FDIC published a working paper which demonstrates that customers’ digital footprints can outperform credit scores in predicting who will pay back a loan (www.fdic.gov/bank/analytical/cfr/2018/wp2018/cfr-wp2018-04.pdf). The researchers examined data on almost 270,000 customers of a German online furniture retailer, then analyzed a handful of variables, with the following results:

    Borrower’s type of device—Android users were almost twice as likely to default as iPhone users;

    Time of credit application—Those who purchased in the middle of the night had a higher default rate;

    Email domain—Those with numbers in their email addresses were almost twice as likely to default; and

    Where they entered the site—Those who clicked on ads and went through the home page had a higher default rate than those who arrived via price comparison sites.
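    As a rough illustration of the kind of tabulation behind such findings, the sketch below computes default rates by digital-footprint variable. The data and column names are synthetic placeholders, not the working paper’s actual data or methodology.

```python
# Illustrative sketch with synthetic data: tabulating default rates by
# digital-footprint variable, in the spirit of the analysis described above.
# Column names and values are hypothetical, not the working paper's data.
import pandas as pd

applications = pd.DataFrame({
    "device":        ["iPhone", "Android", "Android", "iPhone", "Android", "iPhone"],
    "email_domain":  ["name", "name123", "name", "name123", "name123", "name"],
    "entry_channel": ["comparison_site", "ad_click", "home_page",
                      "comparison_site", "ad_click", "home_page"],
    "hour_applied":  [14, 2, 3, 15, 1, 11],
    "defaulted":     [0, 1, 0, 0, 1, 0],
})

# Default rate broken out by each candidate variable.
for col in ["device", "email_domain", "entry_channel"]:
    print(applications.groupby(col)["defaulted"].mean(), "\n")

# Time-of-day effect: middle-of-the-night applications vs. the rest.
night = applications["hour_applied"].between(0, 5)
print("night default rate:", applications.loc[night, "defaulted"].mean())
print("day default rate:  ", applications.loc[~night, "defaulted"].mean())
```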

    All of these variables would be available to a lender, and they could be analyzed with AI. But doing so can lead to problems. Should iPhone users be offered a lower rate if they are a better risk than Android users of the same age and income? What if those users are disproportionately white? These variables could be correlated with protected classes, and there are gray areas around which of these practices should be legal. And what about the future of financial technologies? With biometrics capabilities, the replacement of credit cards with facial recognition technology is on the horizon. Another problem is that banks are required to tell applicants why they were denied credit, but most AI does not provide an audit trail, creating an added layer of complexity. Certainly, these are questions that require human and legal expertise on disparate impact.

    The Possible Benefits of AI and ML with Alternative Data

    However, alternative data such as the “digital footprint” analyzed above could potentially expand access to credit to previously unserved or underserved populations. In working toward that goal, data points such as rental history or utility payments seem to be reasonably low risk. In 2017, the Consumer Financial Protection Bureau (CFPB) issued a No-Action Letter (NAL) to the online lender Upstart. As a condition of receiving its NAL, the lender agreed to compare outcomes from its existing underwriting and pricing model (which employs machine learning) against outcomes from a hypothetical model that uses traditional application and credit file variables.

    In its blog at www.consumerfinance.gov/about-us/blog/update-credit-access-and-no-action-letter/, the CFPB reports:

    “The results provided from the access-to-credit comparisons show that the existing (tested model) approves 27 percent more applicants than the traditional model, and yields 16 percent lower average APRs for approved loans. This reported expansion of credit access reflected in the results provided, occurs across all tested race, ethnicity, and sex segments resulting in the tested model increasing acceptance rates by 23–29 percent and decreasing average APRs by 15–17 percent.”

    Additionally, in many consumer segments, the tested model significantly expands access to credit compared to the traditional model. The tested model reflects that:

    “Near prime” consumers with FICO scores from 620 to 660 are approved approximately twice as frequently.

    Applicants under 25 years of age are 32 percent more likely to be approved.

    Consumers with incomes under $50,000 are 13 percent more likely to be approved.

    With regard to fair lending testing, the CFPB further noted that the results showed no disparities that required further fair lending analysis.

    Regulation?

    How these technologies should be regulated is also a frequent topic of conversation. While it is too early to determine exactly what actions lawmakers or regulators might take, the social ramifications mean that questions about which practices should be permissible, and how they should be overseen, will have to be answered. In June 2019, the newly minted House Financial Services Committee Task Force on Artificial Intelligence held its first hearing. In July, its Task Force on Financial Technology held a hearing on the subject of alternative data and credit scoring (financialservices.house.gov/calendar/eventsingle.aspx?EventID=403824#Wbcast03222017, www.congress.gov/event/116th-congress/house-event/109867).

    In August, FDIC Chairman Jelena McWilliams also said that guidance on how institutions can use machine learning and AI is critical (www.americanbanker.com/news/regulators-must-issue-ai-guidance-or-fdic-will-mcwilliams). U.S. lending laws have to keep pace with technology, which is no easy feat, as these technologies are quickly being adopted by financial institutions and are continually evolving.

    Banks will adopt more AI in the future, and regulatory guidance is on the horizon. Until then, how does a bank manage the fair lending risk it may already carry through algorithms and third-party vendors?

    Steps to Managing Fair Lending Risk

    Managing fair lending risk may need to become a multi-disciplinary team sport that involves members from different risk and compliance teams working collaboratively—specifically the credit, third party, and model risk management teams.

    While fair lending teams may have statistical expertise, model risk teams may be able to provide guidance in reviewing the model design and development process. This could include checking for errors in the data, theory, analysis, assumptions, or code underlying a model. Such early review, during model development or during due diligence when selecting a third-party vendor, could prove invaluable: flaws and potential issues can be identified and removed before they cause any consumer harm. Whether you have other risk teams you can leverage, or responsibility for validation and analytics already lies with another team, the five general areas that should be understood in order to manage and mitigate risk are:

    Identification;

    Development and Use Cases;

    Inputs;

    Outputs; and

    Program and Controls.

    Identification

    The first challenge is identifying that an AI-based model is being utilized for a credit-related function, such as the marketing, underwriting, or pricing of credit products. This may be easier when the models are developed in-house by the bank; however, the functionality often comes via tools or information from third-party vendors.

    While compliance management is often involved in a bank’s vendor risk management efforts, fair lending compliance experts may not be. If the fair lending risk team does not have a seat at the table during vendor due diligence, a model’s potential impact on fair lending risk and compliance may be underestimated or missed entirely. For those who do not have an active role in the due diligence process, it is important to make sure that the compliance representative, or someone within third-party risk management, is able to recognize situations that warrant compliance team involvement.

    It should be further noted that the due diligence process, when it comes to ML and big data, ideally includes a fair lending professional working closely with the developers. This helps ensure that there are fixed boundaries in the algorithm so that it doesn’t make decisions on its own, based on factors that could be problematic.

    Development & Use Cases

    Fully understanding how a model will be used is essential to the development (and subsequent analysis) process. Different types of uses carry different levels and types of inherent risk, which may determine the types of controls needed and the frequency and types of analysis or monitoring. For example, use of a model in product marketing raises the risk that certain minorities or low- and moderate-income groups or geographies could be excluded, and increases the potential risk of redlining.

    Even models with uses that appear to be low risk, or those that appear in tools that are not directly tied to lending, could impact fair lending risk. Consider the potential for models in marketing deposit products, or in sales tools that recommend products. Fair lending risk may increase if existing deposit clients are mined for lending solicitations, or if those product recommendations in a sales tool begin to cause steering concerns. Understanding these uses is also important in comprehending the reasonableness and validity of the model inputs.

    Inputs

    The data used in the AI model is crucial to the appropriateness of that model. In fact, the data set is often the first place that bias is introduced into a model. Underwriting or pricing model inputs need to be closely analyzed and thoroughly understood. The questions to ask are three-fold:

    Is the data accurate?

    Does the data point have a nexus with creditworthiness? (Some data are logical extensions of current underwriting practices, and others are not.)

    Is the data point potentially a proxy for a prohibited basis? (www.frbsf.org/banking/files/Fintech-Lending-Fair-Lending-and-UDAp-Risks.pdf)
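    One way to screen for the third question is sketched below, assuming the bank has (or can reasonably estimate) a protected-class indicator for a historical sample. It asks how well a candidate input predicts that indicator; the data, field names, and AUC threshold are illustrative, and any real review requires statistical and legal judgment on disparate impact.

```python
# Minimal proxy-screening sketch with synthetic data. It asks: how well does a
# candidate model input predict a protected-class indicator? Strong predictive
# power suggests the field may act as a proxy for a prohibited basis.
# The data, field names, and the 0.6 AUC threshold are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 5000
protected = rng.binomial(1, 0.3, n)                      # protected-class indicator (or proxy estimate)
candidate_input = rng.normal(0, 1, n) + 1.2 * protected  # candidate input, correlated with the class

X = candidate_input.reshape(-1, 1)
clf = LogisticRegression().fit(X, protected)
auc = roc_auc_score(protected, clf.predict_proba(X)[:, 1])

print(f"AUC of candidate input for predicting protected class: {auc:.2f}")
if auc > 0.6:  # illustrative threshold; real reviews require statistical and legal judgment
    print("Flag for fair lending review: possible proxy for a prohibited basis.")
```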

    Additionally, the data utilized in creating underwriting and pricing models is often lending data from a prior period. While this data may seem valuable in determining factors that were indicative of borrowers who wound up being “good” or “poor” credit risks, it is important to recognize that past data may not be representative of the demographics of the entire population.

    The use of past lending data in this manner could serve to further existing biases in the data set, especially considering the population the CFPB refers to as the “credit invisible.” In its report, Data Point: Credit Invisibles, the CFPB indicated that approximately 11 percent of the adult population of the United States is credit invisible, i.e., without a Nationwide Credit Reporting Agency (NCRA) credit record (files.consumerfinance.gov/f/201505_cfpb_data-point-credit-invisibles.pdf). Credit invisible adults, however, are not distributed equally across all populations: about 15 percent of Blacks and Hispanics are credit invisible, compared with only nine percent of Whites or Asians.

    Because these credit invisible individuals do not have an NCRA credit record, they are less likely to be extended credit. This also makes it likely that they appear at a lower rate within the past lending data at most banks, which has the potential to skew data sets. And if your inputs are skewed, it stands to reason that your outputs will also be skewed—and that tends to be where we see true customer impact.

    Concern regarding the potential customer impact was noted by the House Financial Services Committee Task Force on Artificial Intelligence. The Financial Services Committee majority staff memo from the June 26, 2019 committee meeting indicates “AI could lead to biased lending decisions without lender oversight and if true, it would mean regulators may want to take steps to ensure algorithms used for compliance or predictive purposes are not subject to such bias” (financialservices.house.gov/uploadedfiles/hhrg-116-ba00-20190626-sd002_-_memo.pdf).

    Outputs

    Outputs, or results, may be the first place you can observe bias. This may come down to whether it can be explained and replicated. If a model is being used for marketing products, for example, what does the output look like? Is it reflective of the demographics of the geographic area? If it is not, there could be something in the model that is exclusionary—and your bank should be able to explain those discrepancies and understand how to resolve them, if necessary. To do this, it’s important for those who understand fair lending and the goal of the process also to understand how the algorithm works and learns.

    In the past, AI programs were task oriented: they could analyze the same task over and over and produce the same result, with parameters set as either “good” or “bad.” AI programs have since evolved; they can be given “good” and “bad” parameters and then work out different degrees of good or bad on their own. For example, with the ML technique called Q-learning, the algorithm constantly tries to achieve an optimal state. It is rewarded when it takes certain actions with a set of data, so it seeks out whatever data gives it the reward and optimizes its performance.

    For instance, in a racetrack game, the program is told that wrecking is bad and hitting all the checkpoints is good. The program figures out that going slow is not optimal and learns that speeding is the fastest way to hit all the checkpoints. It is like a human who learns from experience; it is even similar to the mind of an addict, in that it keeps doing whatever gives it the highest reward to achieve the optimal state.
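    Below is a minimal tabular Q-learning sketch in the spirit of this example. The toy “track,” rewards, and parameters are invented for illustration; the point is that the agent converges on whatever maximizes the reward it is given, which is why reward design and front-end boundaries matter.

```python
# Minimal tabular Q-learning sketch on a toy "track" of 6 positions.
# The environment, rewards, and parameters are invented for illustration:
# the agent is rewarded for reaching the final checkpoint, and penalized per
# step, so it learns to always choose the "fast" action -- it optimizes the
# reward it is given, not any notion of fairness or safety we failed to encode.
import numpy as np

n_states, actions = 6, ["slow", "fast"]  # positions on the track; 0 = start, 5 = finish
Q = np.zeros((n_states, len(actions)))
alpha, gamma, epsilon = 0.5, 0.9, 0.1

def step(state, action):
    move = 1 if action == "slow" else 2
    next_state = min(state + move, n_states - 1)
    reward = 10 if next_state == n_states - 1 else -1  # -1 per step pushes it to finish fast
    return next_state, reward, next_state == n_states - 1

rng = np.random.default_rng(0)
for _ in range(500):  # training episodes
    state, done = 0, False
    while not done:
        a = rng.integers(len(actions)) if rng.random() < epsilon else int(Q[state].argmax())
        next_state, reward, done = step(state, actions[a])
        Q[state, a] += alpha * (reward + gamma * Q[next_state].max() - Q[state, a])
        state = next_state

print("Learned policy:", [actions[int(Q[s].argmax())] for s in range(n_states - 1)])
# Typically prints ['fast', 'fast', ...]: the agent "speeds" because that is what the reward optimizes.
```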

    Again, you would not want protected populations excluded because the algorithm taught itself that other populations were more favorable for some reason. Because ML models learn and change, introducing new factors for consideration, there should be tighter controls programmed on the front end. In addition, there should be human analysis of the outputs to ensure fair, rather than merely optimal, results.

    Program & Controls

    At the heart of any risk management program there must be a well-documented program framework. As with numerous other facets of compliance, a robust risk assessment is a necessary starting point. As a best practice, a bank should consider adopting a process to assess whether AI or new technology introduces unacceptable fair lending risk, perhaps as part of a vendor management or new product review.

    Strong, well-documented controls and ongoing monitoring and analysis are critical. All models should be periodically revalidated to ensure that they are still working as expected and that their outputs are still valid. The level of risk will determine the frequency and intensity of revalidation efforts and any other monitoring and analysis.

    Regression analysis can help determine the impact of each variable on the credit decision, the probability of approval, and the pricing of loans. Those fields should then be reviewed to determine whether they are reflective of creditworthiness and whether they could be considered a proxy for any prohibited basis characteristic. In some instances, such as when dealing with a third-party proprietary tool, a bank may not be aware of each specific data point being utilized in a model. In those cases, if your bank is willing to accept the risk of a more opaque, black box type of model, results monitoring is likely to be your first and only indication of potential issues.
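    For illustration, the sketch below shows the kind of regression a fair lending or model risk team might run on synthetic data: a logistic regression for the probability of approval and a linear regression for pricing, with the resulting coefficients then reviewed for credit nexus and potential proxy concerns. The field names and data are hypothetical.

```python
# Illustrative sketch with synthetic data: regressions estimating each variable's
# impact on the approval decision and on pricing. Field names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 2000
data = pd.DataFrame({
    "credit_score":  rng.normal(680, 60, n),
    "dti":           rng.normal(0.35, 0.1, n),  # debt-to-income ratio
    "device_iphone": rng.binomial(1, 0.5, n),   # alternative-data field to scrutinize
})
logit_true = 0.02 * (data["credit_score"] - 680) - 4 * data["dti"] + 0.3 * data["device_iphone"]
data["approved"] = rng.binomial(1, 1 / (1 + np.exp(-logit_true)))
data["apr"] = 12 - 0.01 * (data["credit_score"] - 680) + 2 * data["dti"] + rng.normal(0, 0.5, n)

X = sm.add_constant(data[["credit_score", "dti", "device_iphone"]])

# Impact of each variable on the probability of approval.
approval_model = sm.Logit(data["approved"], X).fit(disp=0)
print(approval_model.params)

# Impact of each variable on pricing (APR), among approved loans.
approved = data["approved"] == 1
pricing_model = sm.OLS(data.loc[approved, "apr"], X.loc[approved]).fit()
print(pricing_model.params)
```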

    Analysis of results must be frequent enough to verify that they are still in line with expectations. For example, a sudden, statistically significant dip in approvals of borrowers in a prohibited basis group, or disparities in average prices, may be red flags. Likewise, if market segments such as borrowers in low- and moderate-income or majority-minority tracts suddenly stop appearing in the results of marketing efforts, that may also signal a problem.
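    A minimal monitoring sketch follows, assuming the bank tracks approval counts by group (whether self-reported or proxy-estimated). It applies a two-proportion z-test to flag a statistically significant approval-rate gap for follow-up; the group labels, counts, and significance threshold are illustrative.

```python
# Illustrative monitoring sketch: test whether an approval-rate gap between a
# prohibited-basis group and a control group is statistically significant.
# Group names, counts, and the 0.05 threshold are illustrative assumptions.
from statsmodels.stats.proportion import proportions_ztest

approvals = {"control_group": 640, "prohibited_basis_group": 270}    # approved applications
applications = {"control_group": 800, "prohibited_basis_group": 400}

counts = [approvals["control_group"], approvals["prohibited_basis_group"]]
nobs = [applications["control_group"], applications["prohibited_basis_group"]]

stat, p_value = proportions_ztest(counts, nobs)
rate_gap = counts[0] / nobs[0] - counts[1] / nobs[1]

print(f"approval rate gap: {rate_gap:.1%}, p-value: {p_value:.4f}")
if p_value < 0.05 and rate_gap > 0:
    print("Red flag: statistically significant shortfall -- escalate for fair lending analysis.")
```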

    Final Thoughts

    The use of alternative data for lending continues to grow. And Congress, regulatory agencies, community organizations and civil rights advocates will continue to focus on whether use of these technologies adversely impacts fair lending compliance. Banks must be prepared to assess and mitigate both new fair lending risks and existing risks that may evolve in new ways within the rapidly changing environment.

    Fair lending compliance will need to involve a partnership between human and machine, where humans can validate the appropriateness of machine results in an attempt to increase the availability of credit. Although not without its risks, AI can create opportunities for a new set of borrowers.


    ABOUT THE AUTHOR:

    Britt Faircloth, CRCM, Senior Regulatory Consultant for Wolters Kluwer U.S. Advisory Services, focuses on CRA, HMDA, fair lending and redlining data analytics for institutions of all sizes. This includes CRA and fair lending market analysis, fair lending risk reviews, and integrated redlining reviews. In this role, Faircloth brings over 20 years of relevant banking and regulatory compliance experience to assist institutions in performing fair lending risk assessments, UDAAP risk assessments, CRA self-assessments, compliance management system (CMS) reviews, complaint management program reviews, third party vendor program reviews, and other types of quantitative and qualitative data analytics. Most recently, she has helped clients understand the challenges of the new Dodd-Frank HMDA data requirements, providing training, analysis and guidance for improving data integrity.

    Prior to joining Wolters Kluwer, Faircloth held senior-level compliance roles with financial institutions of varying sizes for 10 years and spent an additional eight years in operational positions. With her varied career, she often held positions with concentrations in fair lending, CRA, and HMDA, and she attended the American Bankers Association Compliance School. Reach her at Britt.Faircloth@wolterskluwer.com.


