How SMB Leaders Can Prepare for the AI in HR Revolution

The Artificial Intelligence (AI) revolution is transforming business operations, including human resources (HR), and small to medium-sized businesses (SMBs) can no longer afford to ignore it. This article guides business leaders through the impact, benefits, preparation steps, and challenges of implementing AI tools in HR practices.

What is artificial intelligence?

Artificial intelligence involves the development of computer systems capable of performing tasks that typically require human intelligence. These tasks include visual perception, speech recognition, decision-making, and language translation.

In the context of HR, AI encompasses the use of machines to automate repetitive tasks, analyse large volumes of data, and provide insights for better decision-making.

Future trends in AI and HR

As technology continues to evolve, it is important for SMB leaders to stay abreast of the latest trends in AI and HR so that they can be ready when such tools find their way into the workplace.

Predictive analytics

Predictive analytics involves using historical data to predict future outcomes. In the context of HR, predictive analytics can be used to predict employee turnover, performance, and engagement levels. Having this information could help HR teams to take proactive measures to address any potential issues before they become major problems.
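As a toy illustration of the idea, a turnover-risk score might combine a few HR signals. The features, weights, and threshold below are entirely hypothetical; a real system would learn them from historical data:

```python
# Illustrative only: a toy turnover-risk score. The features, weights,
# and the 0.5 threshold are hypothetical stand-ins for a model trained
# on historical HR data (e.g., via logistic regression).

def turnover_risk(tenure_years: float, engagement: float,
                  months_since_promotion: int) -> float:
    """Return a 0-1 risk score; higher means more likely to leave."""
    score = (
        0.4 * max(0.0, 1.0 - tenure_years / 10)       # newer hires churn more
        + 0.4 * (1.0 - engagement)                    # engagement on a 0-1 scale
        + 0.2 * min(1.0, months_since_promotion / 36)  # time since last promotion
    )
    return round(min(1.0, score), 2)

# Flag employees above the (hypothetical) threshold for proactive follow-up.
employees = [
    {"name": "A", "tenure_years": 1, "engagement": 0.3, "months_since_promotion": 24},
    {"name": "B", "tenure_years": 8, "engagement": 0.9, "months_since_promotion": 6},
]
at_risk = [e["name"] for e in employees
           if turnover_risk(e["tenure_years"], e["engagement"],
                            e["months_since_promotion"]) > 0.5]
```

The point is not the particular formula but the workflow: a score per employee, a threshold, and a human follow-up.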

Natural language processing (NLP)

Natural Language Processing (NLP) is a branch of AI that focuses on the interaction between computers and human language. In the context of HR, NLP can be used to analyse employee feedback, conduct sentiment analysis, and provide insights into employee engagement and satisfaction levels.
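To make the idea concrete, here is a deliberately simple lexicon-based sentiment sketch; production NLP tools are far more sophisticated, and the word lists are hypothetical:

```python
# Illustrative only: a tiny lexicon-based sentiment scorer for employee
# feedback. The word lists are hypothetical; real systems would use a
# trained classifier rather than keyword counting.

POSITIVE = {"great", "helpful", "supportive", "happy", "flexible"}
NEGATIVE = {"stressful", "overworked", "unclear", "frustrated", "slow"}

def sentiment(feedback: str) -> float:
    """Return a score in [-1, 1]: negative, neutral (0.0), or positive."""
    words = [w.strip(".,!?") for w in feedback.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

comments = [
    "My manager is supportive and the team is great",
    "Deadlines are stressful and priorities are unclear",
]
scores = [sentiment(c) for c in comments]
```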

Chatbots and virtual assistants

Chatbots and virtual assistants are AI-powered tools that can interact with employees in real-time and provide instant responses to their queries. These tools can be used to automate repetitive tasks such as answering frequently asked questions, scheduling meetings, and providing information about company policies and procedures.
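A minimal FAQ-style matcher illustrates the principle; real chatbots use intent classification or embeddings, and the FAQ entries here are invented:

```python
# Illustrative only: a minimal FAQ chatbot that matches on word overlap.
# The FAQ entries are hypothetical; production systems would use intent
# classification or embedding similarity instead of raw overlap.

FAQ = {
    "how many vacation days do i get": "Full-time staff receive 25 days per year.",
    "how do i submit an expense report": "Use the expenses form on the intranet.",
    "when is payday": "Salaries are paid on the last working day of the month.",
}

def answer(question: str) -> str:
    """Return the FAQ answer whose stored question shares the most words."""
    q_words = {w.strip("?.,!") for w in question.lower().split()}
    best, best_overlap = None, 0
    for known_q, reply in FAQ.items():
        overlap = len(q_words & set(known_q.split()))
        if overlap > best_overlap:
            best, best_overlap = reply, overlap
    # Fall back to a human when nothing matches.
    return best or "Let me route you to the HR team."
```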

Key areas in HR where AI could make a difference


Recruitment

The recruitment process stands to be transformed by AI-enabled tools. These tools have the potential to automate various aspects of recruitment, from screening resumes and matching them with suitable job descriptions to scheduling interviews and providing candidates with real-time feedback.

Taken further, such tools could perhaps even predict a candidate’s job performance, aiding in the selection process. Importantly, AI tools can already assist in crafting job postings that are devoid of gender bias, thereby fostering diversity and inclusion in the workplace.

Employee engagement

Enhancing employee engagement is another area where AI-enabled tools can be particularly beneficial. Such tools could analyse feedback from employees at a scale no human team could match, monitor engagement levels, and subsequently provide recommendations to improve engagement.

By identifying the reasons behind low employee retention rates, AI tools could then suggest actions for their human counterparts to address these issues.

Performance management

In performance management, AI can play a crucial role by helping set performance goals for employees, monitoring their progress towards these goals, and providing real-time feedback. These tools could even take much of the hassle out of manager evaluations by pinpointing the areas each employee needs to improve.

Additionally, by identifying the strengths and weaknesses of any employee, AI tools could craft personalised development plans to help employees enhance their skills and ultimately benefit the entire organisation.

Learning and development

When it comes to strengthening learning and development programs, future AI tools could analyse an employee’s learning style, past performance, and career aspirations, and then recommend courses that are most relevant to the employee.

They could in theory also track the employee’s progress through these courses and provide feedback to help them improve. Moreover, when applied to the organisation’s data, such tools could identify skills gaps and suggest training programs or present hiring needs.

Employee wellness

Employee wellness is becoming a top priority for organisations, and AI can play a crucial role in this area as well. AI-powered tools could interface with various systems to monitor employee health and wellness at a scale humans simply cannot. Beyond that, they could also suggest interventions by HR team members or proactively offer virtual coaching and support.

What benefits can SMBs expect when adopting AI tools into their HR practices?

Cost efficiency

One of the most significant benefits that SMBs can expect when adopting AI tools into their HR practices is substantial cost savings. The automation of various HR tasks leads to a reduction in the administrative burden and, consequently, operational costs.

By optimising employee engagement and performance management processes, AI reduces the need for extensive manual intervention, cutting operational costs further. Streamlining these processes lets SMBs allocate their resources more efficiently, ultimately leading to significant cost savings.

Time saving

By automating repetitive tasks, AI frees the HR team to focus on more strategic activities. Automating retention reporting, for example, lets the team concentrate on strategies that actually improve retention. AI also shortens response times to employee queries and concerns, improving employee satisfaction.

Improved employee experience

AI offers personalised experiences to employees, from recruitment to learning and development. It can recommend tailored learning paths based on an employee’s career aspirations and learning style. It can also help in creating personalised onboarding programs, ensuring a smooth transition for new hires.

Enhanced decision making

AI tools have the capability to analyse vast amounts of data, providing critical insights that can significantly improve decision-making within an organisation. One of the key areas where this is particularly beneficial is in addressing employee turnover. High employee turnover is a concern for many organisations, and understanding its causes is crucial for developing strategies to mitigate it.

Additionally, AI could also be used to help in identifying the top performers within the organisation. Imagine always knowing who the best employees are and being able to understand the traits that make them the most successful.


Scalability

For small and medium-sized businesses, scalability is a critical consideration. As these businesses expand, their HR needs grow correspondingly, necessitating the implementation of solutions that can grow with the organisation. Implementing AI in HR facilitates the scaling of operations without a proportional increase in costs.

This is because AI tools can automate many of the repetitive and time-consuming tasks associated with HR, reducing the need for additional personnel.

Preparing to bring AI into your organisation

Assessing the current HR processes

The initial step in preparing for the AI revolution in HR is to evaluate the existing HR processes and determine where AI can add value. If the recruitment process is time-consuming and costly, for example, implementing AI in this area would be advantageous.

Choosing the right AI tools

Various AI tools are available in the market, each with its own features and capabilities. It is crucial to select the AI tools that align with the organisation’s needs and objectives. If the organisation aims to improve employee engagement levels, an AI system that can analyse natural language employee feedback and provide recommendations would be beneficial.

Training the HR team

The HR team must possess the necessary skills and knowledge to use AI effectively. Providing training to the HR team on leveraging AI-assisted systems and interpreting the results is essential. It is also important to address any concerns or reservations the HR team may have.

Implementing AI in phases

It is advisable to implement AI in HR in phases. Begin with one area, monitor the results, and then proceed to the next area. For instance, start with implementing AI in the recruitment process, monitor the outcomes, and then move on to employee engagement.

Monitoring and adjusting as needed

After implementing the AI tools, it is crucial to monitor their performance and adjust the implementation if necessary. If the AI system in the recruitment process does not yield the expected results, or doesn’t make the team more efficient, it may be necessary to adjust the implementation or select a different tool.

The importance of employee feedback when introducing AI systems

Gathering and analysing employee feedback is also crucial to successfully adopting AI in the workplace. Employees are the end-users of these AI tools, and their feedback can provide valuable insights into their effectiveness, any potential issues, and areas for improvement.

Conducting surveys

Regular surveys can be conducted to gather feedback from employees on their experiences. These surveys can include questions on the usability of the AI, whether employees are realising the anticipated outcomes, and any challenges they face.

Focus groups

Organising focus groups with employees can provide more in-depth insights. Drawing participants from different departments and levels of the organisation gives a more comprehensive view.

Feedback analysis

The feedback gathered from the surveys and focus groups should be analysed to identify any common themes, issues, or suggestions for improvement. This analysis can also be used to inform any necessary adjustments.

The role of leadership in AI implementation

For the successful implementation of AI in HR, the role of leadership is paramount. Here are some considerations for SMB leaders.

Vision setting

The first step towards successful implementation is for leaders to articulate a clear and compelling vision. This vision should encompass the objectives, expected benefits, and the role of AI in the broader organisational strategy. A well-articulated vision serves as a guiding light for the entire implementation process and helps align the efforts of all stakeholders.

Exemplary leadership

Leaders need to be the change they wish to see. By actively embracing and using AI tools in their work, leaders can inspire their teams to do the same. This not only encourages employees but also helps leaders identify potential issues and areas for improvement.

Support provision

Support from leadership is crucial during the implementation process. This involves allocating necessary resources, facilitating training, and addressing concerns or reservations. Adequate support ensures a smooth implementation process and helps overcome challenges.

Progress monitoring

Leaders should actively monitor the progress of AI implementation. This involves reviewing the AI tools’ performance, analysing employee feedback, and making necessary adjustments. Regular monitoring ensures the implementation stays on track to achieve its objectives.

What challenges should you expect when bringing AI into your workplace?

Privacy and security of data

The analysis of large volumes of employee data is a key component of AI implementation in HR. Ensuring secure storage and processing of this data while respecting employee privacy is crucial. Employees should opt in to have their data used in AI tools, and companies should be cautious about sharing proprietary data.

Managing change

Implementing AI in HR often requires changes to existing processes and workflows. A comprehensive change management plan, including communication, addressing concerns, and providing necessary training and support, is essential for a smooth transition.

Implementation costs

The costs associated with purchasing AI tools, training the HR team, and adjusting existing processes must be carefully considered. It is important to ensure that the investment is justified by the anticipated benefits.

Collaboration between AI and humans

While AI can automate various HR tasks, it is essential to ensure collaboration between AI and humans. For example, although AI can assist in screening resumes, the final decision on hiring should be made by a human.

Addressing bias

Unintended or unconscious bias can manifest in AI-enabled systems, leading to unfair decisions. Teams should be aware of this potential issue and address it by regularly reviewing and adjusting the AI algorithms.

Legal and ethical considerations

It is essential to consider the legal and ethical implications of implementing AI in HR. For example, the use of AI in recruitment must comply with equal employment opportunity laws and regulations.

Acceptance by employees

The success of AI implementation in HR largely depends on employee acceptance. Communicating the benefits of AI, addressing concerns, and providing training can help gain employee acceptance.

Continuous improvement

Implementing AI in HR is a continuous process that requires regular review of the AI tools’ performance and necessary adjustments to ensure their effectiveness.

In Summary

Embracing the AI revolution in HR is essential for SMBs striving to optimise operations and enhance the employee experience. While there are challenges to navigate and considerations to bear in mind, adequate preparation can empower SMB leaders to harness the full potential of AI in HR.

This involves not only selecting the right tools and training the team but also continuously monitoring and adjusting the implementation. Ultimately, this strategic approach will lead to refined HR operations, a more positive employee experience, and the achievement of key business objectives.

About the Author

Matthew Meadows is the CEO of WorkStory, a platform that helps employees grow without the need for time-consuming performance reviews. Through his work, Matt challenges traditional methodologies that no longer resonate in today’s dynamic workplace.

The post How SMB Leaders Can Prepare for the AI in HR Revolution appeared first on The 6Q Blog.

More To Explore


Best of both worlds: Achieving scalability and quality in text clustering

Posted by Sara Ahmadian and Mehran Kazemi, Research Scientists, Google Research

Clustering is a fundamental, ubiquitous problem in data mining and unsupervised machine learning, where the goal is to group together similar items. The standard forms of clustering are metric clustering and graph clustering. In metric clustering, a given metric space defines distances between data points, which are grouped together based on their separation. In graph clustering, a given graph connects similar data points through edges, and the clustering process groups data points together based on the connections between them. Both clustering forms are particularly useful for large corpora where class labels can’t be defined. Examples of such corpora are the ever-growing digital text collections of various internet platforms, with applications including organizing and searching documents, identifying patterns in text, and recommending relevant documents to users (see more examples in the following posts: clustering related queries based on user intent and practical differentially private clustering).

The choice of text clustering method often presents a dilemma. One approach is to use embedding models, such as BERT or RoBERTa, to define a metric clustering problem. Another is to utilize cross-attention (CA) models, such as PaLM or GPT, to define a graph clustering problem. CA models can provide highly accurate similarity scores, but constructing the input graph may require a prohibitive quadratic number of inference calls to the model. On the other hand, a metric space can efficiently be defined by distances of embeddings produced by embedding models. However, these similarity distances are typically of substantially lower quality than the similarity signals of CA models, and hence the produced clustering can be of much lower quality.

An overview of the embedding-based and cross-attention–based similarity scoring functions and their scalability vs. quality dilemma.

Motivated by this, in “KwikBucks: Correlation Clustering with Cheap-Weak and Expensive-Strong Signals”, presented at ICLR 2023, we describe a novel clustering algorithm that effectively combines the scalability benefits from embedding models and the quality from CA models. This graph clustering algorithm has query access to both the CA model and the embedding model; however, we apply a budget on the number of queries made to the CA model. This algorithm uses the CA model to answer edge queries, and benefits from unlimited access to similarity scores from the embedding model. We describe how this proposed setting bridges algorithm design and practical considerations, and can be applied to other clustering problems with similar available scoring functions, such as clustering problems on images and media. We demonstrate how this algorithm yields high-quality clusters with almost a linear number of query calls to the CA model. We have also open-sourced the data used in our experiments.

The clustering algorithm

The KwikBucks algorithm is an extension of the well-known KwikCluster algorithm (Pivot algorithm). The high-level idea is to first select a set of documents (i.e., centers) with no similarity edge between them, and then form clusters around these centers. To obtain the quality from CA models and the runtime efficiency from embedding models, we introduce the novel combo similarity oracle mechanism. In this approach, we utilize the embedding model to guide the selection of queries to be sent to the CA model. When given a set of center documents and a target document, the combo similarity oracle mechanism outputs a center from the set that is similar to the target document, if present. The combo similarity oracle enables us to save on budget by limiting the number of query calls to the CA model when selecting centers and forming clusters. It does this by first ranking centers based on their embedding similarity to the target document, and then querying the CA model for the pair (i.e., target document and ranked center), as shown below.

A combo similarity oracle that for a set of documents and a target document, returns a similar document from the set, if present.
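In code, the combo similarity oracle might look like the following sketch, where `embed_sim` and `ca_sim` are hypothetical stand-ins for the embedding model and the CA model:

```python
# A minimal sketch of the combo similarity oracle. `embed_sim` plays the
# cheap embedding model and `ca_sim` the expensive cross-attention model;
# both are hypothetical stand-ins, as is the budget handling.

def combo_oracle(centers, target, embed_sim, ca_sim, budget):
    """Return (similar_center_or_None, queries_used).

    Centers are ranked by cheap embedding similarity to the target; the
    expensive CA model is then queried in that order until it confirms a
    similar center or the query budget runs out.
    """
    ranked = sorted(centers, key=lambda c: embed_sim(c, target), reverse=True)
    used = 0
    for center in ranked:
        if used >= budget:
            break
        used += 1
        if ca_sim(center, target):  # expensive call, so queried sparingly
            return center, used
    return None, used

# Toy 1-D "documents": embedding similarity = negative distance,
# CA similarity = within distance 2 (a stand-in strong signal).
embed_sim = lambda a, b: -abs(a - b)
ca_sim = lambda a, b: abs(a - b) <= 2
center, used = combo_oracle([0, 10, 20], target=9, embed_sim=embed_sim,
                            ca_sim=ca_sim, budget=2)
```

Because the embedding ranking usually puts the true neighbour first, the CA model is often confirmed on the first query, which is where the budget savings come from.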

We then perform a post-processing step to merge clusters if there is a strong connection between two of them, i.e., when the number of connecting edges is higher than the number of missing edges between two clusters. Additionally, we apply the following steps for further computational savings on queries made to the CA model, and to improve performance at runtime:

We leverage query-efficient correlation clustering to form a set of centers from a set of randomly selected documents instead of selecting these centers from all the documents (in the illustration below, the center nodes are red).

We apply the combo similarity oracle mechanism to perform the cluster assignment step in parallel for all non-center documents and leave documents with no similar center as singletons. In the illustration below, the assignments are depicted by blue arrows and initially two (non-center) nodes are left as singletons due to no assignment.

In the post-processing step, to ensure scalability, we use the embedding similarity scores to filter down the potential mergers (in the illustration below, the green dashed boundaries show these merged clusters).

Illustration of progress of the clustering algorithm on a given graph instance.
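The merge rule from the post-processing step can be sketched as follows, with `edges` standing in for the set of cross-cluster pairs the CA model judged similar:

```python
# Sketch of the post-processing merge rule: merge two clusters when the
# connecting similarity edges between them outnumber the missing edges.
# `edges` is a hypothetical set of cross-cluster pairs judged similar.

def should_merge(cluster_a, cluster_b, edges):
    """True if connecting edges outnumber missing edges between clusters."""
    possible = len(cluster_a) * len(cluster_b)
    connecting = sum(
        (a, b) in edges or (b, a) in edges
        for a in cluster_a for b in cluster_b
    )
    missing = possible - connecting
    return connecting > missing

# Two toy clusters with 3 of the 4 possible cross edges present: merge.
edges = {(1, 3), (1, 4), (2, 3)}
merge = should_merge([1, 2], [3, 4], edges)
```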


Results

We evaluate the novel clustering algorithm on various datasets with different properties using different embedding-based and cross-attention–based models. We compare the clustering algorithm’s performance with the two best-performing baselines (see the paper for more details).

To evaluate the quality of clustering, we use precision and recall. Precision is the percentage of similar pairs out of all co-clustered pairs, and recall is the percentage of co-clustered similar pairs out of all similar pairs. To measure the quality of the solutions obtained in our experiments, we use the F1-score, the harmonic mean of precision and recall, where 1.0 is the highest possible value (perfect precision and recall) and 0 is the lowest, indicating that either precision or recall is zero. The table below reports the F1-score for KwikBucks and various baselines in the case that we allow only a linear number of queries to the CA model. We show that KwikBucks offers a substantial boost in performance with a 45% relative improvement compared to the best baseline when averaging across all datasets.
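These pair-level metrics can be computed directly; the sketch below assumes clusters are given as lists of item ids and ground-truth similar pairs as a set:

```python
# Pair-level clustering metrics as described above: precision over
# co-clustered pairs, recall over ground-truth similar pairs, and their
# harmonic mean (F1). The toy data at the bottom is illustrative.

from itertools import combinations

def pair_f1(clusters, similar_pairs):
    """clusters: list of lists of item ids; similar_pairs: set of frozensets."""
    co_clustered = {frozenset(p) for c in clusters for p in combinations(c, 2)}
    true_pos = len(co_clustered & similar_pairs)
    precision = true_pos / len(co_clustered) if co_clustered else 0.0
    recall = true_pos / len(similar_pairs) if similar_pairs else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: perfect precision, but one similar pair (5, 6) is missed.
clusters = [[1, 2, 3], [4, 5]]
similar = {frozenset(p) for p in [(1, 2), (1, 3), (2, 3), (4, 5), (5, 6)]}
score = pair_f1(clusters, similar)
```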

The figure below compares the clustering algorithm’s performance with baselines using different query budgets. We observe that KwikBucks consistently outperforms other baselines at various budgets.

A comparison of KwikBucks with top-2 baselines when allowed different budgets for querying the cross-attention model.


Conclusion

Text clustering often presents a dilemma in the choice of similarity function: embedding models are scalable but lack quality, while cross-attention models offer quality but substantially hurt scalability. We present a clustering algorithm that offers the best of both worlds: the scalability of embedding models and the quality of cross-attention models. KwikBucks can also be applied to other clustering problems with multiple similarity oracles of varying accuracy levels. This is validated with an exhaustive set of experiments on various datasets with diverse properties. See the paper for more details.


Acknowledgements

This project was initiated during Sandeep Silwal’s summer internship at Google in 2022. We would like to express our gratitude to our co-authors, Andrew McCallum, Andrew Nystrom, Deepak Ramachandran, and Sandeep Silwal, for their valuable contributions to this work. We also thank Ravi Kumar and John Guilyard for assistance with this blog post.


Zero-shot adaptive prompting of large language models

Posted by Xingchen Wan, Student Researcher, and Ruoxi Sun, Research Scientist, Cloud AI Team

Recent advances in large language models (LLMs) are very promising as reflected in their capability for general problem-solving in few-shot and zero-shot setups, even without explicit training on these tasks. This is impressive because in the few-shot setup, LLMs are presented with only a few question-answer demonstrations prior to being given a test question. Even more challenging is the zero-shot setup, where the LLM is directly prompted with the test question only.

Even though the few-shot setup has dramatically reduced the amount of data required to adapt a model for a specific use-case, there are still cases where generating sample prompts can be challenging. For example, handcrafting even a small number of demos for the broad range of tasks covered by general-purpose models can be difficult or, for unseen tasks, impossible. For example, for tasks like summarization of long articles or those that require domain knowledge (e.g., medical question answering), it can be challenging to generate sample answers. In such situations, models with high zero-shot performance are useful since no manual prompt generation is required. However, zero-shot performance is typically weaker as the LLM is not presented with guidance and thus is prone to spurious output.

In “Better Zero-shot Reasoning with Self-Adaptive Prompting”, published at ACL 2023, we propose Consistency-Based Self-Adaptive Prompting (COSP) to address this dilemma. COSP is a zero-shot automatic prompting method for reasoning problems that carefully selects and constructs pseudo-demonstrations for LLMs using only unlabeled samples (that are typically easy to obtain) and the models’ own predictions. With COSP, we largely close the performance gap between zero-shot and few-shot while retaining the desirable generality of zero-shot prompting. We follow this with “Universal Self-Adaptive Prompting” (USP), accepted at EMNLP 2023, in which we extend the idea to a wide range of general natural language understanding (NLU) and natural language generation (NLG) tasks and demonstrate its effectiveness.

Prompting LLMs with their own outputs

Knowing that LLMs benefit from demonstrations and have at least some zero-shot abilities, we wondered whether the model’s zero-shot outputs could serve as demonstrations for the model to prompt itself. The challenge is that zero-shot solutions are imperfect, and we risk giving LLMs poor quality demonstrations, which could be worse than no demonstrations at all. Indeed, the figure below shows that adding a correct demonstration to a question can lead to a correct solution of the test question (Demo1 with question), whereas adding an incorrect demonstration (Demo2 with question, Demo3 with question) leads to incorrect answers. Therefore, we need to select reliable self-generated demonstrations.

Example inputs & outputs for reasoning tasks, which illustrates the need for carefully designed selection procedure for in-context demonstrations (MultiArith dataset & PaLM-62B model): (1) zero-shot chain-of-thought with no demo: correct logic but wrong answer; (2) correct demo (Demo1) and correct answer; (3) correct but repetitive demo (Demo2) leads to repetitive outputs; (4) erroneous demo (Demo3) leads to a wrong answer; but (5) combining Demo3 and Demo1 again leads to a correct answer.

COSP leverages a key observation of LLMs: that confident and consistent predictions are more likely correct. This observation, of course, depends on how good the uncertainty estimate of the LLM is. Luckily, in large models, previous works suggest that the uncertainty estimates are robust. Since measuring confidence requires only model predictions, not labels, we propose to use this as a zero-shot proxy of correctness. The high-confidence outputs and their inputs are then used as pseudo-demonstrations.

With this as our starting premise, we estimate the model’s confidence in its output based on its self-consistency and use this measure to select robust self-generated demonstrations. We ask LLMs the same question multiple times with zero-shot chain-of-thought (CoT) prompting. To guide the model to generate a range of possible rationales and final answers, we include randomness controlled by a “temperature” hyperparameter. In an extreme case, if the model is 100% certain, it should output identical final answers each time. We then compute the entropy of the answers to gauge the uncertainty — the answers that have high self-consistency and for which the LLM is more certain, are likely to be correct and will be selected.
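The consistency measure described above can be sketched by computing the entropy of the sampled answers; the sample answers below are hypothetical:

```python
# Sketch of the self-consistency signal: sample several answers to one
# question, then use the entropy of the answer distribution as an
# (inverse) confidence score. Lower entropy means more consistent, and
# per the observation above, more likely correct. Sample data is made up.

import math
from collections import Counter

def answer_entropy(answers):
    """Shannon entropy (bits) of the empirical answer distribution."""
    counts = Counter(answers)
    n = len(answers)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical samples drawn at nonzero temperature for two questions.
consistent = ["42", "42", "42", "42", "17"]  # low entropy: good pseudo-demo
scattered = ["a", "b", "c", "d", "e"]        # high entropy: discard
keep_first = answer_entropy(consistent) < answer_entropy(scattered)
```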

Assuming that we are presented with a collection of unlabeled questions, the COSP method is:

1. Input each unlabeled question into an LLM, obtaining multiple rationales and answers by sampling the model multiple times. The most frequent answers are highlighted, followed by a score that measures consistency of answers across multiple sampled outputs (higher is better). In addition to favoring more consistent answers, we also penalize repetition within a response (i.e., with repeated words or phrases) and encourage diversity of selected demonstrations. We encode the preference towards consistent, un-repetitive and diverse outputs in the form of a scoring function that consists of a weighted sum of the three scores for selection of the self-generated pseudo-demonstrations.
2. We concatenate the pseudo-demonstrations into test questions, feed them to the LLM, and obtain a final predicted answer.
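The selection score in the first step can be sketched as a weighted sum; the weights and the simple repetition and diversity measures below are hypothetical simplifications of the paper’s scoring function:

```python
# Sketch of the pseudo-demonstration selection score: a weighted sum of
# consistency, a repetition penalty, and diversity. The weights and the
# helper measures are hypothetical simplifications, not the paper's exact
# formulation.

def repetition_penalty(text: str) -> float:
    """Fraction of repeated words in a response (higher = worse)."""
    words = text.lower().split()
    return 1.0 - len(set(words)) / len(words) if words else 0.0

def demo_score(consistency, response, selected, similarity,
               w=(1.0, 0.5, 0.5)):
    """Higher is better: consistent, un-repetitive, and unlike demos
    already selected (diversity = 1 - max similarity to that set)."""
    diversity = 1.0 - max((similarity(response, s) for s in selected),
                          default=0.0)
    return (w[0] * consistency
            - w[1] * repetition_penalty(response)
            + w[2] * diversity)
```

In a greedy selection loop, candidates would be re-scored as the selected set grows, so that each new pseudo-demonstration adds something the set lacks.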

Illustration of COSP: In Stage 1 (left), we run zero-shot CoT multiple times to generate a pool of demonstrations (each consisting of the question, generated rationale and prediction) and assign a score. In Stage 2 (right), we augment the current test question with pseudo-demos (blue boxes) and query the LLM again. A majority vote over outputs from both stages forms the final prediction.

COSP focuses on question-answering tasks with CoT prompting for which it is easy to measure self-consistency since the questions have unique correct answers. But this can be difficult for other tasks, such as open-ended question-answering or generative tasks that don’t have unique answers (e.g., text summarization). To address this limitation, we introduce USP in which we generalize our approach to other general NLP tasks:

Classification (CLS): Problems where we can compute the probability of each class using the neural network output logits of each class. In this way, we can measure the uncertainty without multiple sampling by computing the entropy of the logit distribution.
Short-form generation (SFG): Problems like question answering where we can use the same procedure mentioned above for COSP, but, if necessary, without the rationale-generating step.
Long-form generation (LFG): Problems like summarization and translation, where the questions are often open-ended and the outputs are unlikely to be identical, even if the LLM is certain. In this case, we use an overlap metric in which we compute the average of the pairwise ROUGE score between the different outputs to the same query.
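For LFG, the overlap signal can be sketched with a simple unigram-F1 overlap standing in for the ROUGE score used in the paper:

```python
# Sketch of the LFG confidence signal: average pairwise overlap between
# sampled outputs for the same query. A unigram-F1 overlap stands in
# here for ROUGE; the sample outputs are hypothetical.

from itertools import combinations
from collections import Counter

def unigram_f1(a: str, b: str) -> float:
    """Harmonic mean of unigram precision and recall between two texts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    overlap = sum((ca & cb).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    p, r = overlap / sum(ca.values()), overlap / sum(cb.values())
    return 2 * p * r / (p + r)

def avg_pairwise_overlap(outputs):
    pairs = list(combinations(outputs, 2))
    return sum(unigram_f1(a, b) for a, b in pairs) / len(pairs)

# Outputs that agree closely yield a high score, i.e., high confidence.
samples = ["the cat sat on the mat", "the cat sat on a mat"]
score = avg_pairwise_overlap(samples)
```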

Illustration of USP in exemplary tasks (classification, QA and text summarization). Similar to COSP, the LLM first generates predictions on an unlabeled dataset whose outputs are scored with logit entropy, consistency or alignment, depending on the task type, and pseudo-demonstrations are selected from these input-output pairs. In Stage 2, the test instances are augmented with pseudo-demos for prediction.

We compute the relevant confidence scores depending on the type of task on the aforementioned set of unlabeled test samples. After scoring, similar to COSP, we pick the confident, diverse and less repetitive answers to form a model-generated pseudo-demonstration set. We finally query the LLM again in a few-shot format with these pseudo-demonstrations to obtain the final predictions on the entire test set.

Key Results

For COSP, we focus on a set of six arithmetic and commonsense reasoning problems, and we compare against 0-shot-CoT (i.e., “Let’s think step by step” only). We use self-consistency in all baselines so that they use roughly the same amount of computational resources as COSP. Compared across three LLMs, we see that zero-shot COSP significantly outperforms the standard zero-shot baseline.

USP improves significantly on 0-shot performance. “CLS” is an average of 15 classification tasks; “SFG” is the average of five short-form generation tasks; “LFG” is the average of two summarization tasks. “SFG (BBH)” is an average of all BIG-Bench Hard tasks, where each question is in SFG format.

For USP, we expand our analysis to a much wider range of tasks, including more than 25 classification, short-form generation, and long-form generation tasks. Using the state-of-the-art PaLM 2 models, we also test against the BIG-Bench Hard suite of tasks where LLMs have previously underperformed compared to people. We show that in all cases, USP again outperforms the baselines and is competitive to prompting with golden examples.

Accuracy on BIG-Bench Hard tasks with PaLM 2-M (each line represents a task of the suite). The gain/loss of USP (green stars) over standard 0-shot (green triangles) is shown in percentages. “Human” refers to average human performance; “AutoCoT” and “Random demo” are baselines we compared against in the paper; and “3-shot” is the few-shot performance for three handcrafted demos in CoT format.

We also analyze the working mechanism of USP by validating the key observation above on the relation between confidence and correctness, and we found that in an overwhelming majority of the cases, USP picks confident predictions that are more likely better in all task types considered, as shown in the figure below.

USP picks confident predictions that are more likely better. Ground-truth performance metrics against USP confidence scores in selected tasks in various task types (blue: CLS, orange: SFG, green: LFG) with PaLM-540B.

Conclusion

Zero-shot inference is a highly sought-after capability of modern LLMs, yet achieving it successfully poses unique challenges. We propose COSP and USP, a family of versatile, zero-shot automatic prompting techniques applicable to a wide range of tasks. We show large improvement over the state-of-the-art baselines over numerous task and model combinations.


Acknowledgements

This work was conducted by Xingchen Wan, Ruoxi Sun, Hootan Nakhost, Hanjun Dai, Julian Martin Eisenschlos, Sercan Ö. Arık, and Tomas Pfister. We would like to thank Jinsung Yoon and Xuezhi Wang for providing helpful reviews, and other colleagues at Google Cloud AI Research for their discussion and feedback.
