Machine learning in Governments: Benefits, Challenges and Future Directions

The unprecedented increase in computing power and data availability has significantly altered the way in which, and the extent to which, organizations rely on technology to make decisions. There is a conspicuous trend of organizations seeking to use frontier technologies to support service delivery and day-to-day operational decisions. Machine learning (ML) is the fastest growing and, at the same time, the most debated and controversial of these technologies. Although there is a great deal of research on machine learning applications, most of it focuses on technical aspects or private sector use. Governmental machine learning applications suffer from a lack of theoretical and empirical studies and from an unclear governance framework. This paper reviews the literature on the use of machine learning by government, aiming to identify the benefits and challenges of wider adoption of machine learning applications in the public sector and to propose directions for future research.


Introduction
Machine learning (ML) algorithms learn from data and extract rules from it, then apply the learned insight to assess new data, which allows them to solve many distinct problems ranging from classification and regression to clustering. Nowadays, ML algorithms are the basis for granting loans, online product recommendations, and social media friend suggestions (Adadi and Berrada, 2018). With its unparalleled computational capability to extract information from high-dimensional as well as unstructured data, ML could help unlock the value of data, free up time for high-value work, improve responses to citizen queries, enhance predictive capability for decision-making, and thus fuel innovative public services (Eggers et al., 2017). Many scholars have observed the nascent status of governmental ML applications (Sun and Medaglia, 2019) and the scarcity of related research (Gomes de Sousa et al., 2019; Zuiderwijk et al., 2021). Most AI research has focused on technical issues or private sector use: only 3.5% of the nearly 1,700 publications investigated focused on the use of AI in the public sector (Gomes de Sousa et al., 2019). Zuiderwijk et al. (2019) suggested three reasons to explain this knowledge gap, namely, less AI expertise within the public sector than in the private sector, slower development of leadership in AI governance than the spread of AI applications across governments worldwide, and a lack of focus on the specific problems of AI use for public governance as opposed to technological problems.
Recognizing this knowledge gap, this paper discusses opportunities, challenges, and future directions in the area of algorithmic engagement in the public sector. It strives to serve as a brief primer, based on a literature review, that answers why ML should be used in public policy and what challenges it may pose to policy makers, and that initiates discussion on further improving the use of ML by government. This paper strives to offer an exhaustive examination of leading ML and AI as well as Public Administration journals (see Table 1), using the keywords 'machine learning public policy', 'machine learning government', 'machine learning public sector', 'machine learning social good', and 'machine learning review'. It synthesizes research and examples in action as its theoretical foundation to illustrate current and potential ML utilization in the public sector, allowing an examination of the central topics as well as a proposal for future research directions. It enables public officials to take the first steps in exploring and becoming familiar with ML. It also helps ML experts to understand the specific needs of the public sector so that they can re-examine and improve existing ML applications. It is structured as follows: the first section identifies the sector-specific benefits of public policy engagement with ML, and the second section examines the potential harms and challenges brought by the adoption of ML. Concluding policy advice and research recommendations to maximize its potential and overcome its dangers are given in the last part.

Why use machine learning in public policy?
The potential of machine learning in the public sphere is grounded in the tremendous data availability and policy prediction needs of this field. Machine learning has already demonstrated considerable potential to enhance the effectiveness and accuracy of many decision-making scenarios, ranging from medical diagnosis and mortgage granting to the identification of tax evasion and terrorist activities (Kononenko, 2001; Nowshath et al., 2019; Rodríguez et al., 2019; Mantari et al., 2020). Machine learning is an umbrella term encompassing a wide range of algorithms for fields such as natural language processing, data mining, image processing, and predictive analytics. It is also often referred to simply as algorithms and used interchangeably with artificial intelligence and automated decision-making (European Commission, 2018). This paper adopts this usage and treats the terms machine learning (ML), artificial intelligence (AI), algorithmic systems, and automated decision-making as interchangeable, setting aside their many non-overlapping subfields. As a field of study that enables computers to learn automatically from experience instead of relying on manually and explicitly programmed rules, and to generalize the acquired knowledge to new settings (UNECE Machine Learning Team, 2018), ML can automate repetitive tasks, handle the analysis of large datasets, and provide predictive information in the public sector.
First of all, ML algorithms have an unparalleled computational capability to extract information from high-dimensional as well as unstructured data, such as texts, photos, videos, and blogs. In the digital age, governments have access to a vast quantity of data collected by public bodies for registration, transaction, and record-keeping purposes. These raw data, however, often do not come in a form that traditional statistical models can handle, because of their sheer computational scale and unstructured format (Ubaldi, 2013). Advances in ML-based data processing and analysis could therefore help unlock the value of the data, automate repetitive tasks, understand citizens' needs, and thus fuel innovative services. With the continuing development of communication technology, a tremendous amount of information is stored as digital text, which can be used as an input to policy, social, and economic studies. The powerful combination of ML algorithms and digital text is now revolutionizing the way governments seek help from data. For example, a natural language processing algorithm was developed in Indonesia by Global Pulse, the United Nations big data program. It can identify tweets that mention the prices of basic food items (beef, chicken, onions, and peppers), allowing the government to track food prices in real time and receive early warning of unexpected price spikes (UN Global Pulse, 2014). ML algorithms can also extract meaningful information from data that is more complex than text. A team at Stanford (Gebru et al., 2017) trained an ML model to extract features from Google Street View images of 200 US cities in order to estimate socioeconomic characteristics. The model was shown to achieve higher accuracy in predicting household income, and it can thus not only reduce the cost of labor-intensive door-to-door census surveys but also help address the lag in capturing demographic changes between two surveys.
Thanks to its powerful data processing capabilities, ML can help the public sector automate many highly labor-intensive data processing and analysis activities, thereby increasing the efficiency and speed of government services and actions.
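To make the food-price example more concrete, the following is a minimal sketch of how price mentions could be pulled out of short messages. It is not the Global Pulse system: a keyword list and a regular expression stand in for a trained natural language processing model, and the commodity list, the "Rp" currency prefix, and the function name are assumptions made purely for illustration.

```python
import re

# Food items named in the text; a real system would use a much richer
# vocabulary and a trained classifier (assumption for illustration).
COMMODITIES = ("beef", "chicken", "onion", "pepper")

# Assumed price format: the rupiah prefix "Rp" followed by a number that
# may contain thousands separators, e.g. "Rp 120.000".
PRICE_PATTERN = re.compile(r"rp\.?\s*(\d[\d.,]*)", re.IGNORECASE)


def extract_price_mentions(message):
    """Return (commodity, price) pairs for one message, or an empty list."""
    text = message.lower()
    prices = PRICE_PATTERN.findall(text)
    if not prices:
        return []
    # Strip separators so "120.000" becomes the integer 120000.
    price = int(re.sub(r"[.,]", "", prices[0]))
    return [(item, price) for item in COMMODITIES if item in text]


if __name__ == "__main__":
    for msg in (
        "Beef is now Rp 120.000 per kg at the market!",
        "Chicken and onions cost Rp 35,000.",
        "Nice weather today",
    ):
        print(extract_price_mentions(msg))
```

Aggregating such pairs per day and per item would give the kind of price series from which unexpected spikes could be flagged; a production system would additionally need to handle slang, units, and irrelevant mentions.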
Another merit of ML is its predictive power. A key challenge when developing policy is that the effects of a new policy are unknown until it is implemented. To address this problem, policy-makers resort to comparing similar policies abroad, performing policy trials, or developing statistical models that assist in predicting likely outcomes. ML's predictive power allows policy makers to anticipate a policy's impact before implementation, supporting decisions on policy adoption. For example, after the outbreak of the COVID-19 pandemic, an algorithmic system was applied in Qatar to predict the impact of lockdown policy on COVID-19 cases, assisting the formulation of social restriction policy (Said et al., 2020). Additionally, ML predictions of social service demand can help optimize the allocation of limited social resources. An ML model of homeless family shelter entry and length of stay was developed at New York University to study the likelihood of re-entry and the probability of a homeless family becoming a long-term stayer. The results of the model help shelters understand the demand for beds at any given time, so that they can plan more efficient resource allocation based on predicted demand (Hong et al., 2018). Governance by such algorithmic prediction is increasingly interwoven into many decision-making scenarios, such as predictive policing, smart city planning, and court adjudication (Abdul et al., 2018). Meijer et al.'s study (2019) demonstrates how the application of ML enables a computer to find the spatial pattern of prior criminal activities and automatically forecast, with considerable accuracy, where crime might occur in the future. There has also been a surge of interest in predicting judicial decisions with ML algorithms.
With natural language processing tools and predictive algorithms, Medvedeva's team (2020) learned from the proceeding court reports and automatically predicted future decisions with an average accuracy of 75%, which offers a reliable reference for the verdict.
Other illustrative examples include: (i) tailoring the tax rebate program for the households that are most likely to be consumption constrained (Andini et al., 2018); (ii) improving social welfare by hiring the police who will not use excessive force and promoting teachers that will bring the largest added value (Chalfin et al., 2016); (iii) foreseeing unemployment spell length to assist laborers in savings rates and job search strategies (Kleinberg et al., 2015).

Challenges posed by the use of machine learning
Because of the exponential increase in algorithm-assisted public policy decisions that can widely affect individuals' rights, interests, and expectations, ML is no longer confined to the technical domain but has tremendous potential impact on every aspect of our society. ML can generate numerous benefits in the public sphere. Unfortunately, the literature review offers ample evidence that many concerns around it, such as explainability, fairness, and institutional challenges (Gunning, 2019; Yu et al., 2018; Guenduez et al., 2020), remain unresolved, perpetuating and even exacerbating existing problems. Although the technology community has launched a discussion on the seriousness of these concerns, the main thrust of research has focused either on the technical perspective or on applications in the private sector. In light of the differences between the two sectors in ownership, administrative culture, and relative reliance on political control versus market forces (Perry and Rainey, 1988), greater involvement of the public policy perspective in understanding and addressing these issues has become a pressing research agenda.

Fairness, Bias, and Discrimination
Although ML can generate numerous benefits in the public sphere, there is ample evidence that intentional or unintentional discrimination against certain groups and individuals is still an unresolved issue. For instance, Amazon created an AI recruiting tool to rank candidates and make hiring decisions. For technical jobs such as software engineering, however, the tool learnt to automatically ignore women's CVs, eliminating women's chances of getting such jobs (Kodiyan, 2019). In most cases, bias does not arise from algorithm design; rather, algorithms inherit existing bias from historical data that contains remnants of bias from human decision-making and culture. In this case, the root of the prejudice lies in the way traditional gender ideology and latent discrimination are captured in the data from which the algorithm learns (Leavy et al., 2020). A significant portion of the data used to train the AI recruiting system consisted of the resumes of the company's employees, who were mostly male. This gender inequality, embedded as data imbalance, is a natural reflection of existing male dominance in the workplace. The algorithm learned this pattern and thus continued to sustain it, meaning that, without human intervention, algorithms tend to maintain the status quo instead of making progress.
Predictive policing is another area that has raised concerns regarding racial bias. Although the use of ML technology to predict future crime participants and crime locations has doubled the accuracy of crime prediction compared with existing practice (Zach, 2013), it has also contributed to discrimination against a particular racial group. For example, ML predictive policing systems may inappropriately associate darker skin with greater criminal suspicion or lead to more arrests for minor crimes in communities of color (Selbst, 2017). As these real-world incidents show, data-driven decision-making is not free from existing, real-world biases. Since public decisions made by AI will have a large-scale and profound impact on many aspects of society and on individuals, they must be as fair as possible. If not developed with awareness, algorithms can inherit or even exacerbate the historical inequities underlying the input data. Many discussions of the benefits of ML use in the public sector draw on the argument that 'the more data governments and public institutions manage to integrate into their systems, the higher the capabilities of machine learning to make decisions based on this data will be' (Cary & David, 2017).
Thus, putting algorithm design and the quality of input data under scrutiny to prevent unfair algorithmic decisions is an essential step in integrating AI as part of service support in the public sphere.
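One simple way to put data and outcomes under this kind of scrutiny is to compare selection rates across groups. The sketch below is only an illustration under stated assumptions: the records are invented toy data, the function names are hypothetical, and the ratio of the lowest to the highest selection rate is just one of many possible fairness measures, not a method taken from the studies reviewed here.

```python
from collections import defaultdict


def selection_rates(records):
    """Compute the share of positive decisions per group.

    records: iterable of (group, was_selected) pairs.
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in records:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {group: selected[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected in any group, so no disparity
    return min(rates.values()) / highest


# Invented toy data: an imbalanced history with 8 men and 4 women.
TOY_RECORDS = (
    [("men", True)] * 4 + [("men", False)] * 4
    + [("women", True)] * 1 + [("women", False)] * 3
)

rates = selection_rates(TOY_RECORDS)
ratio = disparity_ratio(rates)
print(rates, ratio)
```

A ratio well below 1 does not by itself prove discrimination, but it signals that the data or the model deserves closer review. A threshold of 0.8, the "four-fifths" rule of thumb used in US employment guidance, is sometimes applied as a warning level.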

Explainability, Transparency, and Accountability
Another aspect relevant to fairness is providing reasoning for the decisions made by algorithms. In the public sector, where decision-makers are hierarchically and democratically accountable, and where transparency is of crucial importance, explanations of the decisions leading to a certain policy are particularly important. Citizens have the right to access information about the procedures and data that lead to decisions affecting them (Scantamburlo, 2019). Transparency of the decision-making process is the mechanism that facilitates accountability (Diakopoulos, 2016), allowing the entire procedure to be tracked and responsibilities to be identified when failures occur. However, because of the lack of simplicity and observable results, it is challenging for decision-makers to ensure explainability, transparency, and accountability when it comes to machine decisions. For instance, teachers from Houston prevailed in a lawsuit against their schools over an algorithmic system that evaluated their performance (Webb, 2017). Those who received a good evaluation won praise, while those who received a bad rating risked termination. Some teachers believed that the system penalized them unfairly, but they had no way of knowing for sure, since the firm that developed the software, the SAS Institute, considered its algorithm a trade secret and refused to reveal its workings. When the teachers brought their case to court, a federal judge determined that the program had violated their civil rights. A teacher has the right to know the reasons behind decisions that affect his or her career, even if they are made by a computer. This example illustrates the problems of using ML in the public sector, where users have no way of knowing why the algorithm made a certain decision or who is to blame when the machine makes undesirable decisions.
There is a requirement for explainability and transparency of the algorithms to ensure their accountability.
As ML is applied in many sensitive areas of policy making, the need for explainability and transparency of the decision-making process, and for clear accountability for decisions in the public sector, becomes even more important. As Rudin (2019) argues, explainability and interpretability are necessarily defined in a domain-specific way; in the context of the public sector, no consensus on these concepts has yet been reached, owing to the nascent status of investigations. Since policy makers rarely have a background in ML and require explanations for different purposes than model developers do, this paper defines explainability as the ability of algorithmic systems to present understandable post-hoc explanations of the causes of their decisions in various forms, such as natural language explanations, visual explanations, and explanations by example. Once the reasoning behind machine decisions is revealed to policy makers, it is up to them to decide whether or not to adopt these decisions, leaving them accountable for the behavior and potential impacts of those decisions.
However, it is not an easy task to make ML models explainable and transparent. There tends to be a trade-off between explainability and predictive accuracy, that is, between models that are easier to interpret and complex models that provide more accurate predictions. Users are often unable to describe in detail how a decision comes about, or on what aspects of the data it is based (Adadi and Berrada, 2018), because the algorithms with higher prediction accuracy usually take an unexplainable form in order to capture subtle correlations between the input variables. The opaque nature of algorithms makes communication between machine learning experts, policy makers, and other domain specialists difficult (Letham et al., 2015). Improving the explainability of machine learning will provide more transparency into how decisions are reached, enabling decision-makers and citizens to track and understand the whole decision-making process, and will eventually improve the accountability of algorithmic decision-making systems.

Institutional Challenge
Researchers have also begun to investigate the institutional reasons behind the challenge of incorporating technology-centric solutions into decision-making in the public sector. A decision-making process through which a data-driven approach becomes fully integrated into the organizational culture is crucial to the success of algorithmic support provision. Although most high-level national AI strategies show a welcoming attitude towards participation in this cutting-edge technology, the lived experience of bureaucrats at the micro-level is often at odds with this attitude, because the use of AI challenges the traditionally bureaucratic form of the public sector (Bullock, 2019). Case studies show that the adoption of innovations such as ML in government often takes place within existing systems and utilizes existing tools, leading to bottlenecks and undermining the organization's ability to manage innovation engagement effectively (UNITED NATIONS, 2020). Exacerbating this, the public sector is also heavily resource-constrained and path-dependent, given its limited infrastructure and employee resistance to technological innovation (Veale et al., 2018). Guenduez et al.'s study (2020), using big data as an example, also revealed widespread skepticism towards technological innovation among public managers.
The use of big data is increasingly studied in the context of the policy process or "policy cycle" (Höchtl et al., 2016). In contrast, only a few studies have investigated the opportunities that ML, the workhorse of the big data era, can bring to the policy-making process. Most studies in the field of digital transformation in public policy do not pay enough attention to the specific effects of ML on public services. This is of particular importance in the context of the current COVID-19 crisis, in which algorithmic tools have been developed and put to use on the front line to keep the economy working and to provide health services. As new situations and a wider range of AI applications emerge, AI and the underlying regulatory mechanisms for data ecosystems have become crucial policy concerns (Jordan, 2020). The wide-ranging impact of ML applications justifies the need to identify the gap between the ML literature and the understanding, expectations, and needs of policy makers.

Moving towards better use of machine learning
The challenges described above are interlinked and eventually lead to unreliable and undesirable applications of algorithms in the public sector. For instance, researchers found that the PredPol algorithm, a widely used predictive policing platform that supports the allocation of policing resources by predicting future crime risk, improperly links dark skin to greater suspicion of having committed crimes or leads to more arrests for nuisance crimes in communities of color (Selbst, 2017). Due to the complexity of PredPol's algorithm, it is not easy to understand how its decisions are made or to detect its bias. If police officers cannot comprehend the predictions of the software, they may not act on them, hampering the effectiveness of predictive analytic tools. This might explain the outcome of a recent survey of 70 police agencies in which 70 percent had predictive tools but only 22 percent were actually using them to help make decisions (Perry et al., 2013). To encourage AI to take a larger role in the policy-making process, it is fundamental to come up with solutions that address these issues. Here I present three possible directions that future research can explore to ensure an understandable, accountable, and prevalent use of ML in the field of public policy.

User-centered Explainable AI
Many benefits of ML have yet to be realized within the public sector, due to its opaque nature and the accountability requirements that governments face. To address this opacity, explainable artificial intelligence (XAI) was initiated to "enable human users to understand, appropriately trust, and effectively manage the emerging generation of artificially intelligent partners" (Gunning, 2017). XAI is ultimately a human-computer interaction (HCI) problem, a multidisciplinary field at the intersection of social science and artificial intelligence that is shifting towards more interpretable AI. Interpreting the reasons behind predictions can transform an untrustworthy model or prediction into a trustworthy one (Ribeiro, 2016). Friedler proposed two forms of interpretability: global interpretability means understanding the whole of a trained model, whereas local interpretability means understanding the effects of a trained model on a particular input and its corresponding outcome (Friedler et al., 2019). A more relevant way to classify explanations is to divide the users of explanations according to their level of knowledge of AI into technical and non-technical users (Wanner et al., 2020).
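The distinction between global and local interpretability can be illustrated with a deliberately simple linear scoring model. This is a sketch under stated assumptions rather than a method from the cited works: the feature names, weights, and bias are invented, and real XAI tools are needed for models that are not linear.

```python
# Hypothetical linear risk model; weights and feature names are invented.
WEIGHTS = {
    "prior_visits": 0.8,
    "household_size": 0.3,
    "months_since_exit": -0.5,
}
BIAS = -1.0


def score(case):
    """Model output for one case: bias plus the weighted sum of features."""
    return BIAS + sum(WEIGHTS[name] * case[name] for name in WEIGHTS)


def global_explanation():
    """Global view: features ranked by the size of their weight in the
    model as a whole, independent of any individual case."""
    return sorted(WEIGHTS.items(), key=lambda item: abs(item[1]), reverse=True)


def local_explanation(case):
    """Local view: how much each feature contributed to this one score."""
    return {name: WEIGHTS[name] * case[name] for name in WEIGHTS}


case = {"prior_visits": 2, "household_size": 6, "months_since_exit": 3}
local = local_explanation(case)
print(global_explanation())
print(local, score(case))
```

In this toy case the globally most influential feature is `prior_visits`, yet for the particular input the largest contribution comes from `household_size`. This is why an explanation aimed at the person affected by one decision can differ from an explanation of the model as a whole.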
However, existing XAI focuses primarily on providing technical interpretations for technical users to debug or improve the performance of algorithms, while non-technical end users, the largest group of XAI users, who will be affected by machine decisions or are authorized to approve system adoption, are largely ignored. Despite the booming research on XAI, current work tends to take an algorithm-centric view, disregarding the specific needs of real-world users (Liao et al., 2020). There is an increasing awareness that different users have different requirements for the explanation of a prediction. An analyst might be particularly interested in the internal workings of the estimator, while a decision-maker might be more interested in the key facts or data points that lead to the prediction (Sørmo et al., 2005). As a result, appropriate and differentiated explanations should be provided to different user groups, based on their distinct levels of expertise, needs, and expectations.
Although recent research has shifted its attention slightly to the end users of model explanations (Ehsan et al., 2021), most XAI researchers do not pay enough attention to different audiences to target their explanations (Ribera, 2019). Furthermore, a large part of this literature is about explaining the inner workings of the algorithms instead of justifying the results (Sørmo et al., 2005; Adadi and Berrada, 2018). In the public sector, where decision-makers are hierarchically and democratically accountable, the justification for a policy and its underlying evidence is particularly important. Therefore, the need is urgent, and the time is ripe, for interdisciplinary collaboration between the machine learning and social science communities to answer the question of what type of algorithmic interpretation is most needed by policy makers and how this need can be met from a technical and practical perspective. I believe that a user-centered XAI research agenda is becoming more necessary today, when algorithms are involved in many high-stakes decision-making processes.
By offering tailored explanations based on the characteristics and expectations of the targeted group, user-centered XAI could provide a level of transparency that eases the detection of model bias, unveiling hidden correlations in the input data that could lead to discriminatory outcomes (Ahn, 2019). Besides, it promotes trust in and social acceptance of not only the policy design made by the algorithms but also the policy itself, because it enhances citizens' involvement in the process of service design, helps communication with the groups affected by the decisions, and thus raises citizens' awareness of and trust in policy decisions. In the context of AI engagement in the public sector, the development of XAI needs to be supported by multidisciplinary collaboration to ensure that it meets the needs and expectations of public officials. Therefore, future research in the domain of public policy should shift its focus to a user-centered XAI approach that can meet different stakeholders' distinct needs and expectations, in order to provide people-centered services with an algorithm-involved approach.

Algorithmic Impact Assessment
Due to the nature of public sector responsibility and the unresolved concerns around the use of ML, poorly designed, unregulated algorithms can have far-reaching adverse impacts, often involving the most vulnerable members of society. Public authorities urgently need a practical framework to assess algorithmic systems (European Parliament, 2019). The idea of implementing impact assessments of algorithms to ensure their accountability is gaining momentum (Moss et al., 2020). For instance, the Government of Canada requires a questionnaire-style impact analysis to ensure that its agencies are using ML algorithms in a manner that is compatible with core administrative law principles such as transparency, accountability, legality, and procedural fairness (Kuziemski, 2020). In the US Congress, the Algorithmic Accountability Act of 2019 was proposed, requiring companies to perform impact assessments of automated decision systems and evaluate their impact on accuracy, fairness, bias, discrimination, privacy, and security (Booker and Wyden, 2019).
Creative Commons Attribution 3.0 Austria (CC BY 3.0), 2021.
To anticipate, avoid, and mitigate the negative consequences of algorithmic decision-making, many researchers have proposed a framework named Algorithmic Impact Assessment (AIA) as a practical approach to algorithmic accountability (Reisman et al., 2018). The AI Now Institute at New York University outlined four initial goals of AIA, namely providing the public with information about the systems that decide their fate, granting external researchers meaningful access to review and audit systems, developing the expertise of public agencies to assess the performance and impact of automated decision systems, and strengthening due process by offering the public the opportunity to engage with the AIA process before, during, and after the assessment (Reisman et al., 2018). The research and practice of AIA are still in their infancy, and issues such as the scope and structure of the assessment, when to conduct it, which impacts should be counted, and who should conduct it are still open to discussion. Nevertheless, some basic consensus has been reached on the form of AIA. First, AIA requires a self-assessment of algorithmic systems within public agencies to assess potential impacts on fairness, justice, bias, and other harms and to make mitigation plans. To complement internal self-assessment, which is insufficient on its own, AIA also emphasizes the establishment of regularly conducted review processes by external researchers. Second, AIA emphasizes the combined participation of public authorities, domain experts, and the public. Algorithmic decision-making processes can be extremely intricate, and challenges such as bias and systematic error cannot be easily identified by evaluating systems on a case-by-case basis. It is therefore recommended that external expert analysis be conducted at the group level, across disciplines, and on an ongoing basis.
The framework also recommends public disclosure of the purpose, scope, intended use, and self-assessment process of the algorithmic system, along with the associated policies, at the start of the assessment process, in order to collect early external feedback and adjust the assessment to the most pertinent public concerns (Koene et al., 2019).
Through agency self-assessment, public disclosure of system adoption, and plans for meaningful access for external researchers and experts, AIA offers agencies a framework to evaluate the automated decision systems they adopt and to provide the public with greater insight into the workings of these systems, preparing governments to face the risks they present. Further research and practice are needed to formulate a standard format and application of AIA to ensure that AI moves in a direction beneficial to society.

Transformation Towards AI-powered culture
The pandemic has shown the critical role that AI plays in facilitating rapid policy decisions based on real-time data analysis and prediction. However, as explained above, the public sector still shows hesitance or lacks the capacity to embrace the AI era that has arrived. The United Nations has suggested nine key pillars for the AI transformation (UNITED NATIONS, 2020), prioritizing changes in government organizational culture, the implementation of a new regulatory framework, and the development of new individual capacities. The easiest starting point is to train public servants, from the top policy-makers down, to facilitate the implementation of AI. Developing such training projects can improve public officials' knowledge, skills, and attitudes related to ML. As for the specific format of the training, Fountaine suggests the introduction of internal AI academies, which usually include classroom work (online or in-person), workshops, on-the-job training, or onsite visits to experienced commercial companies (Fountaine, 2019).

Conclusion
In terms of AI application, public sectors hold a dual role: as AI regulators, governments are obligated to shield citizens from the detrimental impact of the algorithms, while as AI users, governments are facing the pressure to respond to the demands of a rapidly-evolving society, boosting its efficiency with the help of ML (Kuziemski et al, 2021). Therefore, governments should take a distinct approach to respond to the new development of AI technology and social requirements. In this article, I first identified the benefits of ML use in government including automating complex data analysis, innovating public services, supporting policy making with impact prediction, and optimizing resources allocation, due to ML's data processing capabilities and predictive power. This section provided a justification for the pressing need for policy makers to understand, examine and embed an ML-powered approach to delivering services and making decisions, especially in the light of the COVID-19 pandemic. Then I raised my concern about the technical, legal, ethical, and organizational barriers and risks that impede a wider adoption of ML in the public sector. In view of the discussion in this paper, it is suggested that future research directions include: • User-centered Explainable AI; • Algorithmic Impact Assessment; and • Transformation Towards AI-powered culture.
This article has some limitations because the application of ML in the public domain is a new field, still in its infancy, that needs more discussion and research. While this paper intersperses many real-world examples, most ML projects in the public domain are still in the pilot or early implementation phase, and there is a lack of valid results with which to evaluate the use of ML for public policy. The reviewed papers may not show the full extent of current government ML use and discussion, as some papers may have been excluded from the examined range. The analysis and synthesis of the literature discussed in this paper may run the risk of providing only hypothetical ideas about the benefits, challenges, and solutions of using ML in the public sector, as the literature analyzed is often normative and exploratory, lacks empirical support, and focuses only on assumptions and expectations (Studinka et al., 2018). For example, addressing explainability challenges in the context of ML use in the public sector may not be easy from a technical perspective. It may even conflict with privacy protection in some cases, as people whose data are used to train ML models may object to having their data, or inferences about their data, exposed.
Many important questions remain unanswered, such as the specific definitions and needs of the public sector regarding ML explainability, transparency, and accountability, and the framework for protecting government and citizens from the possible adverse consequences of ML. ML use in the public sector is by nature an interdisciplinary issue that requires cooperation among the AI community, the public policy community, and other related communities. This paper calls for more practice and research through a multidisciplinary approach, to further understand the specific benefits, challenges, and solutions of ML applications from technical, social, legal, and political perspectives. More case studies are needed in different policy areas, in different regions, and at specific government levels, to enable the analysis and comparison of these benefits, challenges, and solutions. Future research should also aim to develop a practical framework, informed by case studies, pilot implementations, or large-scale implementations, to guide and govern the incorporation of ML innovations and mitigate possible risks. Further domain-specific, interdisciplinary research is necessary for ML algorithms to reach their full potential in the public policy field. This paper is an introduction to ML in a public policy context that calls for a more specific, systematic, and interdisciplinary investigation of the complexities of government AI use and regulation.