Team Building: Indoor Office Games for Employees


Indoor office games are an effective and exciting tool for enhancing your team's togetherness. Lift your employees’ moods and create a dynamic, fun, and supportive environment with these indoor office games!

With young millennials and Gen Z dominating the workforce, companies must transform their approach to building solid and productive teams. While rigid, formal engagement methods are increasingly a thing of the past, games have emerged as an ideal solution for team building.

Indoor office games offer a break from the monotonous daily routine, injecting a dose of excitement into the office. These activities provide more than just a momentary diversion; they are powerful tools for enhancing communication and relationship building.

Let’s discover why indoor office games matter for employees. Along the way, we’ll explore a selection of game ideas ideally suited for the office environment. Keep scrolling!

Why indoor office games matter for team building

Unlike some previous generations, Millennials and Gen Z are digital natives. They grew up with technology and expect workplaces to be tech-savvy, flexible, and adaptable.

They highly value teamwork, collaboration, and work-life balance. So, creating a sense of belonging and community becomes essential while offering flexible work arrangements and well-being initiatives.

Indoor office games can bridge the generational gap in the workplace. Before diving into the game ideas, it’s worth emphasising why team building is crucial for any organisation. Consider the following points:

Reducing stress

Indoor office games inject fun into the workday. They allow employees to leave their desks, relax, and recharge. These moments of enjoyment can help combat burnout, reduce stress levels, and keep employees motivated.

Developing skills

Many indoor office games promote vital skills such as communication, problem-solving, and creativity. These games can engage employees and enhance their capabilities for their day-to-day tasks.

Fostering collaboration

Collaboration is a core value for many millennials and Gen Z employees. Team-building activities encourage interaction and cooperation among team members, breaking down silos and promoting a sense of unity.

Improving communication

Communication is the backbone of a successful organisation. Team-building activities encourage employees to communicate more openly and effectively.

Building trust

Trust is crucial in any workplace. Games that require team members to rely on each other, such as the trust fall or escape room challenges, help build trust and camaraderie within teams.

Indoor games can be an alternative to company retreats. Allocating one to two hours for fun activities in your office can foster a positive sense of community. Ultimately, you can improve employee relationships and build loyalty for a thriving business.

Indoor office game ideas for employees

Indoor office games offer diverse opportunities for team building, communication improvement, and enjoyable engagement, catering to different group sizes and objectives within your organisation.

These fun activities can also make great social media content that humanises your brand on online platforms. You can leverage video tools to make quick edits to your video content. Here are some indoor game ideas to try in your office.

#1. Charades

Time allocation: Depending on the number of participants, allocate 30 minutes to an hour for this game.

How to play: Divide participants into two teams. Each team takes turns choosing a team member to act out a word or phrase with their body while the rest of the team tries to guess what it is. A time limit, usually one or two minutes, adds excitement. Keep score to determine the winning team.

Perfect for: Charades fosters creativity, non-verbal communication, and teamwork. It’s a great icebreaker for smaller groups or teams looking to inject fun into their workday.
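If your team prefers a digital scoreboard over pen and paper, a minimal Python sketch can handle word selection and scoring. The word list and team names below are placeholders; swap in your own prompts:

```python
import random

# Placeholder prompt list; replace with your own words or phrases.
WORDS = ["juggling", "photosynthesis", "time travel", "karaoke"]

def pick_word(words, used):
    """Pick a prompt that hasn't been acted out yet and mark it as used."""
    word = random.choice([w for w in words if w not in used])
    used.add(word)
    return word

def score_round(scores, team, guessed_in_time):
    """Award one point if the team guessed within the time limit."""
    if guessed_in_time:
        scores[team] = scores.get(team, 0) + 1

def winners(scores):
    """Return the team(s) with the highest score."""
    best = max(scores.values())
    return [team for team, pts in scores.items() if pts == best]
```

A facilitator can call pick_word each turn, run the one- or two-minute timer on a phone, then record the result with score_round before announcing the winners.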

#2. Indoor Mini Golf

Time allocation: Plan for at least an hour, depending on the complexity of your indoor golf course.

How to play: Transform your office space into a mini-golf course using everyday objects as obstacles. Employees can play individually or in teams.

Provide golf clubs and balls. Each hole should have a par score. Employees take turns trying to complete the course with the fewest strokes.

Perfect for: Indoor mini-golf is ideal for a larger group of employees who enjoy a mix of skill, creativity, and friendly competition. It’s a unique way to promote creativity while engaging in a healthy rivalry.

#3. Office Olympics

Time allocation: Allocate half a day to a full day, depending on the number of events.

How to play: Design a series of office-themed challenges and events. These could include events like the “Sticky Note Dartboard Challenge” or the “Office Chair Relay Race.”

Employees form teams and compete in these fun, quirky events. Keep track of scores, and have a medal ceremony for the winning team.

Perfect for: Office Olympics are excellent for promoting teamwork, creativity, and physical activity. They work well for larger groups and are especially fun for company-wide events or team-building retreats.

#4. Back-to-back Drawing

Time allocation: Dedicate 45 minutes to an hour for this game.

How to play: Pair employees up, seating them back-to-back. Give one person a picture or image and hand the other a blank piece of paper and a pen.

The person with the image must describe it to their partner without revealing what it is. The partner must draw based on the description. Compare the drawings afterwards for laughter and fun.

Perfect for: Back-to-back drawing is a great icebreaker that encourages communication, active listening, and creativity. It’s suitable for small to medium-sized groups.

#5. Pictionary Telephone

Time allocation: Allow about 30-45 minutes for this game.

How to play: Have participants sit in a line, each facing the back of the person in front of them. Start by whispering a word or phrase to the first person, who then draws a picture representing it.

The next person interprets the drawing and redraws it within a time limit. This continues down the line until the drawing reaches the last person, who guesses what the original word or phrase was.

Perfect for: Pictionary telephone is excellent for enhancing communication skills and creativity. It’s perfect for breaking the ice in medium-sized groups.

These indoor office games offer a delightful way to build teamwork, encourage creativity, and break the workday routine.

#6. Drawing in the Dark

Time allocation: Plan for approximately 30 minutes to an hour, depending on the number of rounds.

How to play: Pair employees and provide each pair with a drawing board, paper, and markers. Blindfold one team member and give them a simple object or concept to draw. The other member must guess what is being drawn, relying on verbal communication with their blindfolded partner. Switch roles after each round.

Perfect for: Drawing in the dark is an excellent game for enhancing communication and creativity. It’s suitable for small to medium-sized groups and encourages employees to think outside the box.

#7. Whisper Challenge or Lip Reading

Time allocation: Dedicate 30-45 minutes for this game.

How to play: Pair up employees and have them sit facing each other. One person wears noise-cancelling headphones with music playing while the other person whispers a word or phrase.

The person with the headphones must lip-read and guess what their partner is saying. Rotate roles after each round.

Perfect for: This game is perfect for improving non-verbal communication, observation skills, and teamwork. It’s suitable for small to medium-sized groups and will bring plenty of laughter.

#8. Blindfolded Obstacle Course

Time allocation: Allocate around 45 minutes to an hour for this game.

How to play: Set up an obstacle course within your office space, using items like chairs, tables, and cushions as obstacles.

Blindfold participants and have them navigate the course with the guidance of a teammate who can only use verbal instructions. Time each team’s progress through the course.

Perfect for: The blindfolded obstacle course promotes trust, teamwork, and problem-solving. It’s ideal for small to medium-sized groups and can be a physically engaging and memorable experience.

#9. Office-Friendly Sports Match

Time allocation: Depending on the sport, allocate 30 minutes to an hour for each match.

How to play: Organise friendly sports matches within your office space. Options include table tennis, foot volleyball, or basketball.

Divide employees into teams that compete in a mini-tournament format. Keep track of scores and crown a champion.

Perfect for: Sports matches are excellent for team bonding, friendly competition, and physical activity. They are suitable for smaller groups and can be tailored to the preferences and space available.

#10. Cooking Challenge

Time allocation: Dedicate at least 2-3 hours for this challenge.

How to play: Divide employees into teams and provide them with a set of ingredients and cooking utensils. Each team must prepare a dish within a specified time frame.

Assign judges to taste and evaluate the dishes based on creativity, taste, and presentation. Announce a winning team.

Perfect for: This game encourages teamwork, creativity, and culinary skills. It’s best suited for small to medium-sized groups and can be a memorable and delicious team-building experience.

#11. Office Pictionary

Time allocation: Plan for 30 minutes to an hour for a full game session.

How to play: Divide employees into teams. Provide each team with a whiteboard or paper and markers.

Teams take turns selecting a word or phrase and having one team member draw it while the others guess what it is within a time limit. You can use a random word generator or prepare a list of office-related terms.

Perfect for: Office Pictionary is a fun and creative game that enhances communication, teamwork, and problem-solving. It’s suitable for small to medium-sized groups and can be an entertaining addition to team meetings or social gatherings.
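If you'd rather not prepare a list by hand, a tiny random word generator is easy to sketch in Python. The office-related terms below are just examples:

```python
import random

# Example office-related prompts; extend with your own terms.
OFFICE_TERMS = ["stapler", "deadline", "coffee break",
                "spreadsheet", "conference call", "out of office"]

def word_dealer(terms, seed=None):
    """Yield each term exactly once, in random order, so no prompt
    repeats within a single game session."""
    rng = random.Random(seed)
    deck = list(terms)
    rng.shuffle(deck)
    yield from deck
```

Each team draws the next prompt from the dealer on its turn; once the deck runs out, the session ends, or you start a freshly shuffled dealer.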

How often should you conduct indoor games in the office?

The frequency of indoor office games should align with the specific needs and preferences of your organisation and its employees. Regular assessment and feedback will help you fine-tune your approach over time to achieve the desired outcomes.

In general, it’s good practice to mix regular, smaller-scale activities (e.g., monthly or quarterly) with larger-scale events (e.g., semi-annual or annual) to maintain a balance between team building and daily work responsibilities. Footage from these events can also be a great way to showcase your organisation's work life through compelling animated ads.

However, there are a few points to consider when you plan to conduct indoor office games as regular events in your company, such as:

  • Be mindful of employees’ workloads and time constraints. Frequent games that disrupt work too often can be counterproductive.
  • Collect feedback from employees after each game session to assess its impact.
  • Consider the budget and resources allocated for team-building activities. More frequent games may require more planning and resources.
  • Avoid excessive repetition of the same games to keep employees engaged. Rotate through a variety of activities to keep things fresh and exciting.
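To put the last point into practice, a short Python sketch can rotate through a shortlist of games so nothing repeats until every activity has had a turn. The game names here are placeholders:

```python
from itertools import cycle, islice

# Placeholder shortlist; use whichever games suit your team.
GAMES = ["Charades", "Indoor Mini Golf", "Office Olympics",
         "Back-to-back Drawing", "Pictionary Telephone"]

def plan_sessions(games, n_sessions):
    """Assign one game per upcoming session, cycling through the list
    so no game repeats until all of them have been played."""
    return list(islice(cycle(games), n_sessions))
```

For seven monthly sessions, the plan simply wraps around to the start of the list after the first five games.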

Moreover, the frequency of indoor games should align with your company’s culture. If your organisation strongly emphasises team building and employee engagement, you might schedule these activities more often.

If your workforce includes remote employees, consider virtual team-building activities conducted online. The frequency of such activities might differ from in-person games.

In Summary

Indoor office games play a pivotal role in nurturing a vibrant and motivated workforce. They go beyond mere entertainment, offering a pathway to stronger communication, enhanced collaboration, and a deeper sense of belonging among employees.

However, the key to reaping the full benefits lies in the art of selection and arrangement. By choosing activities that resonate with your team’s needs and objectives and arranging them strategically, you can create an environment where fun and productivity coexist harmoniously.

So, whether it’s a simple icebreaker or an elaborate team-building event, remember that the impact of these games reaches far beyond the immediate moment. They help foster a workplace culture where employees thrive, innovate, and achieve their best collectively.

About the Author

Andre Oentoro is the founder of Breadnbeyond, an award-winning explainer video company. He helps businesses increase conversion rates, close more sales, and get positive ROI from explainer videos (in that order).

