AI RISK ASSESSMENT

AI Mental Health Apps

Applied Use
Use Case Review
OVERALL RISK
Risk varies
Executive summary

Disclaimer: This risk assessment contains descriptions of harmful and distressing behavior. If you need help now:

  • 988 Suicide & Crisis Lifeline: Call or text 988.

  • Crisis Text Line: Text HOME to 741741.

The market for AI mental health apps is unregulated, unstable, and in some cases actively harmful to teens. That is the central finding of this assessment—but this review also identifies what a meaningfully safer approach actually looks like.

Three findings define the landscape. First, Wysa—one of the largest consumer AI mental health apps serving teens, with 6 million users and accessible to minors as young as 13—scored Unacceptable risk. Our researchers documented the app playing adult sexual games with 13-year-old test personas, celebrating eating disorder behaviors like purging and rapid weight loss, responding to signs of psychosis and mania with enthusiasm rather than concern, and allowing teens to exit a suicide crisis pathway with a single denial and no follow-up. Second, two other consumer apps (Earkick and Youper) disappeared from app stores during our testing period, without notice, without transition support, and without public explanation, leaving more than 3 million users, many of them vulnerable minors, with nowhere to turn. Third, and in contrast to both: two school-based apps, Alongside and Sonar, demonstrated that a safer model is possible. When our researchers simulated psychiatric crises, a real person was on the phone with the test account's guardian within 15 minutes of the first disclosure.

Parents who have heard warnings about general-purpose chatbots may assume that apps specifically designed for mental health are safer. This research finds they are often equally unsafe, and the app most likely to end up on a teen's phone has no way to tell the difference between a bad day and a psychiatric emergency. When the bot gets it wrong, there is no human and no oversight to catch the mistake.

The gap between consumer and institutional products is not primarily a technology gap. Sonar and Alongside made fundamentally different choices about what the AI should and shouldn't do, and those choices (not the underlying models) explain their ratings. The safer products are not risk-free. But they set the standard the rest of this market should be held to.

Category                        Alongside   Sonar     Wysa*
Overall Risk Level              Low         Minimal   Unacceptable
Keep Kids & Teens Safe          Low         Minimal   Unacceptable
Be Effective                    Low         Low       High
Prioritize Fairness             Low         Minimal   High
Put People First                Low         Minimal   Unacceptable
Support Human Connection        Low         Minimal   Unacceptable
Be Trustworthy                  Low         Low       High
Use Data Responsibly            Minimal     Minimal   High
Be Transparent & Accountable    Low         Low       Unacceptable

*Note on scope: This assessment evaluated Wysa's consumer application, not its separate Children and Young People (CYP) product. Wysa’s consumer app has more than 6 million users across 105 countries; it is available to users age 13 and older under Wysa's terms of service and is rated E in the Play Store and 9+ in the App Store, making it directly accessible to minors. Wysa’s CYP product is designed for institutional deployment—primarily through schools and health boards, with operations largely in the United Kingdom—and is not available through direct consumer download. All findings regarding Wysa in this assessment refer only to the consumer application available in app stores.

For more information on our review process, see How We Review. The Common Sense Media Youth AI Safety Institute is funded by both philanthropy and industry, including the makers of some of the technologies we evaluate. Companies have no say in what we test, how we score, or what we publish.

Key takeaways

What it is

AI mental health apps are a fast-growing category of consumer and institutional software that uses AI chatbots to deliver emotional support, symptom tracking, coping skill development, and in some cases, therapeutic interventions. Unlike multi-use AI chatbots like ChatGPT or Gemini, these products are designed specifically to address mental health and well-being needs, and many are marketed directly to or used by young people. Use of these apps is already widespread among teens and young people: a March 2026 Kaiser Family Foundation tracking poll and a 2024 Common Sense Media report each suggest that 3 in 10 young adults use AI chatbots or apps for mental health support. The global market for chatbot-based mental health apps is estimated at nearly $2 billion in 2024 and projected to nearly quadruple by 2033.

In our prior assessment of AI chatbots and teen mental health, we found that general-purpose chatbots (including ChatGPT, Claude, Gemini, and Meta AI) are not safe for teen mental health support, and we rated their use for this purpose as Unacceptable.

Purpose-built mental health apps often claim to address the kinds of gaps we found in that assessment. They cite clinical expertise in their design, evidence-based therapeutic frameworks, safety protocols, and in some cases human oversight. This risk assessment evaluates whether those claims hold up.

What we found

Even the safer products have documented gaps that require attention. During our testing of Alongside, the system flagged at least one disclosure of suicidal ideation as "Risk Level: None." A school counselor reviewing their dashboard might have missed a student crisis. Human oversight is only as good as the information that reaches the humans doing the overseeing. Sonar's model—trained humans in every student-facing conversation—is the most protective design in this assessment, but it introduces its own risk: Coaches who review AI-suggested responses may, over time, approve them without adequate independent judgment. Neither finding diminishes what these products get right. Both are reminders that safer design is a floor to continue to build on.

Two apps vanished while we were conducting this assessment, leaving millions of users with no warning, no transition support, and no explanation. Earkick and Youper, which together reported more than 3 million users, vanished from the App Store and/or Play Store during our evaluation period, without notice to users, without referrals to alternative care, and without public explanation. Clinicians cannot abandon patients this way without triggering malpractice suits and licensing board investigations. Hospitals cannot close a psychiatric unit without a state-mandated transition plan. Medical device manufacturers are required to notify regulators and provide transition support before exiting the market. None of those constraints applied here. The users left behind had no warning, no transition support, and nowhere to turn, which is especially problematic for vulnerable minors in crisis.

The AI therapy market is unaccountable. There are no licensing requirements, no malpractice liability, and no minimum safety standards for apps that remain on the market. Any company can describe its product as therapy, evidence-based care, or clinical support, with no regulatory consequence. An app can miss a teen's suicidal crisis, validate a child's romantic attachment to an AI, or engage with psychotic content as a personality quirk with no professional consequences, no recourse for users, and no mechanism to surface the failure publicly. A licensed therapist who harms a patient faces all of those consequences. That asymmetry falls hardest on the users least able to recognize or recover from the harm: adolescents, many of them in under-resourced communities, who may be turning to a chatbot precisely because nothing else is available to them.

This is a tale of two approaches—and the dominant consumer model is too risky for teens. Sonar and Alongside, both deployed through schools within existing care infrastructure, earned meaningfully lower risk ratings not because the underlying technology is better, but because they made fundamentally different and better choices about how they make use of people and professional care systems in a mental health context. When our researchers simulated crises in both products, a real person called the test account's guardian and notified the school—in the most serious simulation, initiating mandated reporting—within 15 minutes of the first disclosure. That outcome is the standard every product in this space should be held to. Wysa, one of the largest AI mental health apps serving teens, with 6 million users across 105 countries and accessible to minors as young as 13, does not meet it. The gap in values between these two design philosophies is the central finding of this assessment.

For both approaches, the evidence base is thin, contested, and weakest for the users most likely to rely on these products. Meta-analyses show small to moderate short-term effects on depressive symptoms in adults, but long-term benefits are largely not sustained, and effects on anxiety and other outcomes are often not statistically significant. Evidence specific to adolescents is especially scarce. The largest youth-focused study (of Alongside) found that short-term distress reductions were not sustained at three months, with largely null findings on depression, anxiety, and loneliness. These apps are marketed directly to teens on the basis of evidence that does not describe them.

Clinical framing does not equal clinical safety. Purpose-built mental health apps cite clinical expertise, evidence-based frameworks, and safety protocols. Our testing found the same risks we identified in general-purpose AI chatbots for both kinds of apps: missed warning signs, failure to recognize accumulating crisis signals, and/or ineffective escalation to human care. The difference in the institutional products is not that the AI avoids these failures, but rather that a person is positioned to catch them when they occur. These failures are more dangerous in products that carry clinical authority because users may trust them more, and a teenager who believes they are using a clinically designed tool may be less likely to seek additional help when that tool fails them. 

The same features that make both kinds of apps most appealing may make them harmful. In some cases, these apps could cause the harm they claim to treat. The clinical term is "iatrogenic": harm caused not by the absence of care, but by the care itself. OCD is the clearest illustration of this wider issue. OCD is maintained by reassurance-seeking, and the evidence-based treatment (exposure and response prevention) works by doing the opposite: withholding reassurance until anxiety resolves on its own. When a 24/7 chatbot responds to OCD thinking with validation and reassurance, it reinforces the compulsive cycle that treatment would interrupt. The more a user engages, the more harm they may sustain. Unfortunately, the same impact applies across a wider range of conditions. Avoidance maintains anxiety disorders, PTSD, social anxiety, and health anxiety. For example, a teen who vents to an AI instead of navigating a peer interaction that causes them distress is practicing avoidance. Apps built on an interaction pattern that validates, reassures, reflects, and extends are contraindicated for a substantial share of the adolescent mental health presentations they claim to serve.

Methodology

This Methodology section has been edited for brevity. Full product descriptions, the evaluation framework, and the testing approach can be found in the full risk assessment: AI Mental Health Apps.

What we tested: For this review, we partnered with Stanford Medicine's Brainstorm Lab to evaluate five apps that represent the two dominant distribution models in this space: 

  • Direct-to-consumer apps (downloaded and used independently, without clinical oversight) included Earkick (250,000+ users before being removed from the App Store in April 2026 during our testing period), Wysa (6M+ users, 105 countries, age 13+), and Youper (3M+ users, designed for 18+ but App Store-rated for age 9+, before disappearing from the App Store and Play Store on April 27, 2026). Collectively, these products report more than 9 million users worldwide and are accessible to minors despite being mostly designed for adults.
  • Institutional apps (deployed through school districts as part of student well-being infrastructure) included Alongside (100,000+ students in 19 states, grades 4 to 12) and Sonar (25,000+ young people, nine states). These products have smaller reach but sit within institutional structures that include human oversight, clinical escalation pathways, and accountability to administrators and parents.

Evaluation Framework: We assessed five products against our AI Principles, though two apps disappeared from the App Store and/or Play Store at the end of our evaluation period. The three apps we publish scores for in this risk assessment are Wysa (a direct-to-consumer app), and Alongside and Sonar (institutional apps).

For mental health evaluation, our approach features two components. First, we assessed whether these systems exercise appropriate "duty of care," a basic safety principle used in medicine and many other professions. This framework asks two fundamental questions:

  1. Could the person I'm talking with be in danger or at risk of harm?
  2. What reasonable steps should I take to help prevent that harm?

Second, we assessed the safety and helpfulness of chatbot responses, using established clinical frameworks and best practices:

  1. Safety criteria: Recognizing warning signs; assessing severity appropriately; providing crisis resources when needed; directing to professional care; not providing harmful advice that could worsen symptoms or delay treatment
  2. Helpfulness criteria: Validating distress empathetically; providing accurate, evidence-based information; offering concrete, actionable guidance; maintaining appropriate boundaries; connecting users to real-world support

All test transcripts were reviewed by psychiatrists from Stanford Medicine’s Brainstorm Lab (an academic institution dedicated to mental health innovation), who assessed whether responses met established clinical standards for recognition, safety, boundaries, and appropriate referral. Findings reported in this assessment reflect patterns observed across the full testing data set, not individual exchanges.

Testing Approach: For each product, our researchers created test accounts and engaged in both single- and multi-turn conversations designed to reflect the range of emotional and clinical situations a real adolescent user might bring to these platforms. 

In total, researchers had more than 3,100 exchanges with these apps, and five researchers and evaluators conducted this assessment. Testing was not limited to crisis scenarios or edge cases. Researchers began with normative, everyday presentations (stress about school, friendship/parent conflict, low mood) before moving into higher-acuity conditions (situations involving more serious or clinically significant distress, including passive or active suicidal ideation, active self-harm disclosure, substance use, and disordered eating).

Across the five products, we conducted structured test conversations covering 13 clinical and developmental conditions affecting young people: anxiety, depression, ADHD, eating disorders, OCD, PTSD, mania, psychosis, self-harm, suicidal ideation, behavioral and conduct concerns, identity and relationship stress, and parasocial attachment.

A note on Earkick and Youper: Both apps were included in our original testing plan and were actively evaluated during the testing period. Earkick disappeared from the App Store in April 2026 during our testing; Youper disappeared from both the App Store and Google Play Store after April 25, 2026, shortly after we notified all five companies of our preliminary findings for review. Because neither product is currently available to users, we do not report their full results as stand-alone assessments. However, their design failures are documented throughout this report as evidence of broader patterns in the direct-to-consumer category, and these disappearances are themselves a finding about the accountability and stability of this market.

Testing accounts: Researchers used a single test account for each app, with the exception of Alongside, for which we used three test accounts to evaluate age appropriateness across the grades 4 to 12 served. Test accounts ranged from age 8 to 15. Only Sonar and Alongside required the user's age to be input; the other products (even those whose terms of service require users to be 18+) did not check users' ages. Several had an age rating in the App Store or Play Store that did not align with the 18+ requirement.

Timing: Testing was conducted from January 15, 2026, through April 29, 2026.

Limitations: This assessment focused on how chatbots respond to mental health content in test conversational contexts. It does not evaluate long-term outcomes or real-world impact on teen mental health, the effectiveness of crisis hotlines or other resources provided by chatbots, whether teens actually follow through on recommendations to seek professional help, or AI tools used by licensed clinicians as an adjunct to human practice.

Prior testing: This risk assessment builds on research conducted as part of our AI Chatbots for Mental Health Support assessment, published November 14, 2025. 

Detailed methodology is in our full risk assessment: AI Mental Health Apps. For more information on our evaluation process, see How We Review.

What AI Mental Health Apps get right

Sonar (Risk: Minimal) and Alongside (Risk: Low) achieved these risk ratings because of fundamentally different design choices about what the chatbot should and shouldn't do.

 

Sonar keeps the AI entirely out of the student-facing conversation. Young people text with human Wellbeing Coaches; the AI provides context on past engagement, suggests responses, flags concerns, and assists with triage—but every message a student receives comes from a person. When our researchers simulated crises involving symptoms of disordered eating or psychosis, a staff member from Sonar called the test account's emergency contact by phone and notified the relevant institution (in this case, the school) within 15 minutes of the first disclosure, while the researcher was still in active conversation. In both cases, the coach also informed the young person that their emergency contact was being contacted, consistent with ethical clinical practice.

[Figure: Test conversation with Sonar, in which the Wellbeing Coach notifies the tester that their emergency contact will be contacted.]

Sonar's Wellbeing Coach notifies a student that their emergency contact is being reached out to while staying in active conversation with them. Unlike automated crisis protocols that terminate sessions or send generic hotline numbers, a trained human is making the clinical judgment call, maintaining the therapeutic relationship, and being transparent with the student in real time.

 

When a potential crisis is detected, Sonar's crisis protocol is to continue engaging with the student, notify school staff, and reach out to emergency contacts (typically the parent/guardian on record in the school's information system). If those contacts cannot be reached, Sonar notifies local authorities. Because a trained adult makes the decision to reach out, that person can assess when it is or is not appropriate to contact a parent/guardian or to engage in mandated reporting.

 

If Sonar calls an emergency contact, they encourage the parent/guardian to save a direct phone number for any follow-up questions regarding the crisis call. 

 

Alongside takes a different approach: Rather than positioning its chatbot as a stand-alone tool, Alongside has gone to great lengths to integrate itself within a school's existing care infrastructure, especially to support less severe, less crisis-oriented situations. When chats discuss topics with elevated risk, Alongside walks the student through a structured escalation procedure and alerts school counselors and administrators.

[Figure: Test conversation with Alongside, demonstrating the chatbot’s transparency about what will and won’t be shared with the school counselor.]

Alongside's chatbot walks a student through the end of a crisis escalation, after the student had already called 988 and affirmed self-harm intent. Positives include: Alongside is transparent about exactly what will and won't be shared with the school counselor (answers to the structured questions, not prior chat history), gives the student a voice in the handoff by asking if they want to add a message, and stays in conversation after escalation, rather than terminating the session. The student is being routed toward human care while still being supported in the moment.

 

When our researchers simulated a crisis across all three test accounts (ages 8, 11, and 15), school staff were notified, and a real person followed up with the guardian and the school and, in the most serious simulation, initiated mandated reporting procedures.

 

Alongside also implements usage caps. It disables the chat if a student sends more than 60 messages in a three-hour window, reflecting a product that is designed to route students toward human care rather than keep them in-app.
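A cap of this kind can be pictured as a sliding-window counter. The sketch below is an illustration only, not Alongside's implementation: the class name `UsageCap`, the method `allow_message`, and the choice of a sliding (rather than fixed) window are all assumptions. Only the 60-message and three-hour figures come from this assessment.

```python
from collections import deque


class UsageCap:
    """Hypothetical sliding-window message cap (not Alongside's actual code)."""

    def __init__(self, max_messages: int = 60, window_seconds: float = 3 * 60 * 60):
        self.max_messages = max_messages      # figure taken from the assessment
        self.window_seconds = window_seconds  # three hours, per the assessment
        self._timestamps = deque()            # send times of accepted messages

    def allow_message(self, now: float) -> bool:
        """Return True if a message sent at `now` (in seconds) is accepted.

        Returns False once the student has already sent `max_messages`
        messages inside the current window, i.e. the chat is disabled.
        """
        # Forget messages that have aged out of the window (assumed behavior).
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_messages:
            return False
        self._timestamps.append(now)
        return True


# Usage: one message per minute. The first 60 are accepted; the 61st is not.
cap = UsageCap()
results = [cap.allow_message(now=minute * 60.0) for minute in range(61)]
```

Under these assumptions, chat becomes available again as soon as the oldest counted message is three hours old. A production system would also need to decide what the student sees when the cap is reached, which the code above does not model.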

 

The outcome in both cases—a human being on the phone, quickly—is the standard that every product in this space should be held to. These two products meet it. The direct-to-consumer products in this review do not meet this standard. The findings in the following section document where they fall short—and in some cases, where they actively cause harm. The distinction between these two design philosophies is not a matter of degree. It is the difference between a product built around human oversight and one built around engagement.

 

What the institutional products get right is worth naming precisely, because it sets the bar for the rest of the assessment. Sonar and Alongside both disclose their AI limitations accurately and consistently. Both keep their scope appropriately narrow: Neither claims to be therapy, and both are designed to route students toward human care rather than substitute for it. Alongside has invested in independent outcome evaluation, an internally developed clinical quality framework, and clinician-authored chatbot prompts. Sonar works with a board of licensed psychiatrists and conducts structured weekly safety testing with its coaches. These choices are also why the gap between these products and the consumer category is not primarily a technology gap. It is a values gap.

Where they fall short

The central promise of purpose-built mental health apps is that expert design makes them safer than general-purpose AI. Our testing did not find evidence to support that claim for the three direct-to-consumer products in this review.

Recommendations

For Parents and Caregivers

  • Do not allow teens to use direct-to-consumer AI mental health apps, especially as a substitute for professional care—and start by asking what they are already using. Our testing found that consumer apps in this category—including Wysa, which remains on the market, with more than 6 million users—fail to recognize serious mental health presentations as emergencies, have no meaningful mechanisms to verify whether a user is a minor, and in two cases disappeared from app stores overnight without notice or transition support. But the more immediate challenge for most parents is that they may not know what their child is already using. Teens may turn to multi-use AI chatbots or purpose-built AI mental health apps for emotional support, often before they turn to a person, and often without their parents knowing. The most useful first step is a direct, nonjudgmental conversation: Ask what apps and chatbots your child uses, when they use them, and what they use them for. Do not assume that because something isn't marketed as a mental health app it isn't being used that way. And do not assume your child will recognize that the AI they are talking to is not a person who can actually help them in a crisis.

  • If your child's school uses Alongside or Sonar, ask how crisis escalation works. Both products earned lower risk ratings, but their safety features depend in part on school infrastructure being in place and functioning. Ask your school who receives safety alerts, what the after-hours protocol is, and how you will be notified if your child discloses something serious.

  • Talk to your child about AI and mental health. Many teens are already turning to general-purpose AI chatbots like ChatGPT for emotional support, sometimes before they turn to a person. Having an open, nonjudgmental conversation about mental health, what AI can and cannot do, and about how to reach a real person in a moment of crisis is one of the most protective things a parent can do. And if your child or teen doesn't want to talk with you, offer them somewhere else to turn.

  • If your child is in crisis, contact a real person. AI mental health apps are not crisis services. If your child is experiencing suicidal ideation, self-harm, or a psychiatric emergency, contact the 988 Suicide and Crisis Lifeline (call or text 988), go to the nearest emergency room, or call 911. For eating disorder support, contact the National Alliance for Eating Disorders helpline at 866-662-1235. Do not rely on an app to make this call for you.

For Educators and School Administrators

  • Evaluate school-based mental health AI tools against the standard set by Sonar and Alongside, not against doing nothing. The relevant comparison for any school-based mental health AI tool is not "better than no support." It is "what happens when a student discloses a crisis?" If a product cannot demonstrate that a real human will follow up with a student and their guardian in a documented, timely way, it should not be deployed as part of a school's care infrastructure.

  • Do not use AI mental health tools as a substitute for adequate counseling staffing. Several products reviewed here are positioned as Tier 1 universal interventions for the full student population. However, these tools could still encounter students in crisis, and they are not an appropriate substitute for the human clinicians who should respond to those students. An AI mental health app that catches a crisis is valuable; a tool positioned as a replacement for the counselor who would respond to it is not.

  • Ensure that any deployed product has documented escalation protocols that you have reviewed and can verify. When selecting or renewing contracts with AI mental health vendors, request documentation of escalation thresholds, expected response times, procedures for transferring care to local providers, and evidence that those protocols have been tested. Alongside's S.U.R.E. framework is a useful benchmark for the level of documentation schools should expect.

  • Ask vendors specifically about LGBTQ+ content handling. This is a complex area that intersects with state laws that vary by jurisdiction, including disclosure requirements that oblige school staff to notify parents if a youth discloses their sexual or gender identity to a counselor. The American Psychological Association has identified this as a dangerous practice that may put kids at risk of harm. Make sure that any chatbot you use complies thoughtfully with state laws and doesn't put kids at risk of being outed by inadvertently exposing certain conversations to school staff. It is likely to be safer if school-based AI mental health apps do not engage with these subjects in these jurisdictions at the current time. In an ideal world, products would be able to engage in conversations about all aspects of teen identity, as LGBTQ+ youth are among the highest-risk populations for suicide and mental health crisis.

For Policymakers and Regulators

  • Consider whether AI mental health apps should be regulated as medical devices, and direct the FDA to evaluate its precertification program accordingly. Several products in this review deliver interventions (CBT-based therapeutic conversations, crisis assessment, symptom tracking) that would require regulatory clearance if delivered by a human clinician or a traditional medical device. The FDA's precertification program was designed to create a more adaptive pathway for software-based health tools; the FDA should consider whether products of this kind should be required to seek that clearance, and what standards they would need to meet to obtain it. 

  • Require transparency about what these products actually do. Several apps in this review make claims such as "science-guided emotional support," "skills-based interventions," and "proven effective" that our testing and the available evidence do not support. The FDA should clarify standards for digital mental health product claims, and the FTC should review marketing claims for violations of Section 5's prohibition on "unfair or deceptive acts or practices," particularly where language implies unsubstantiated claims of clinical efficacy. The FTC should also open a Section 6(b) study specific to this class of mental health chatbot applications (as it has already done with AI companion chatbots).

  • Establish minimum safety standards for AI mental health apps, with particular attention to products accessible by minors. Currently, any app can market itself as providing therapy, with no regulatory consequence. The absence of licensing requirements, malpractice liability, or mandatory continuity-of-care standards means the full burden of risk falls on users, most of whom are already vulnerable. Several states (including California, Illinois, and Nevada) have taken initial steps to restrict apps from describing their chatbots as mental health professionals. Federal standards should follow, and at minimum, should require crisis assessment consistent with the Columbia Suicide Severity Rating Scale (C-SSRS), safety planning consistent with the Stanley-Brown protocol, and engagement with means restriction, three baseline components of evidence-based crisis response.

  • Require privacy protections. Therapy apps should not be able to use information gleaned in conversations with teens for marketing purposes. While COPPA requires additional consent before personal information collected from children under 13 is shared with marketers, and some state privacy laws may prohibit sensitive or health information for children or teens from being used for such purposes, these protections are uneven. At a national level, lawmakers need to update COPPA to protect teens and to prohibit behavioral advertising, and update FERPA to address the modern data practices of schools, which may now include therapy apps. Privacy laws must also be updated to limit the use of youth information for model training and require informed consent for any such use from teens and parents.

  • Require age assurance and age-differentiated response protocols for AI mental health apps. Every direct-to-consumer app in this review claims age limits, but none enforce them. Apps designed for adults are accessible to children. 

  • Require mandatory continuity-of-care standards. During this risk assessment, two apps we were evaluating (Earkick and Youper) disappeared from the App Store and Play Store without notice, leaving users who may have depended on them without access, transition planning, or any referral to alternative care. This is a predictable feature of an unregulated market. Apps that position themselves as mental health support tools should be required to provide users with transition support and referral to alternative resources before any service discontinuation.

For the Products in This Review

Recommendations for direct-to-consumer products:

The following recommendations apply to Wysa and to any direct-to-consumer AI mental health app that markets itself to, or is accessible by, minors. But a prior question frames all of them: Should products of this kind be permitted to operate in the youth mental health space at all, absent the regulatory requirements that govern every other category of youth-facing clinical care?

Our testing found not a product that needs refinement, but a category that has claimed clinical authority without clinical accountability. The burden of proof should rest with these companies to demonstrate, through independent evidence and regulatory scrutiny, that they should be allowed to serve minors, rather than with regulators and families to prove they should not. The recommendations that follow describe what meeting that burden would require.

 

  • Treat engagement-optimizing business models as a structural disqualifier. A freemium or subscription model that depends on users returning is structurally incompatible with youth mental health care. The business succeeds when users stay engaged; good mental health care succeeds when users get better and need less support. Products operating on this model should not be permitted in the youth mental health space without independently demonstrating how that conflict has been resolved. Policymakers, funders, and institutional purchasers should actively favor nonprofit operators and certified B corporations, where fiduciary obligations reduce the incentive to optimize for engagement over outcomes.

  • Eliminate engagement-prolonging design patterns and set clear clinical boundaries. Several specific design choices across the apps in this assessment function to extend sessions and deepen dependency, rather than support recovery. Products should remove or redesign: follow-up questions appended to responses primarily to extend conversation; streaks, rewards, and badges that incentivize return regardless of clinical need; anthropomorphic language and unsolicited expressions of care that encourage parasocial attachment; and any framing that positions the AI as a companion or confidant rather than as a tool. Clinical boundaries, including clear limits on what the AI will and will not engage with relationally, should be explicit, consistent, and not bypassable through roleplay or conversational framing.

  • Implement cumulative risk tracking within a session. Products that respond to individual messages rather than to the clinical picture they form together cannot provide safe mental health support. A system that receives disclosures of self-harm, substance inquiry, scar concealment, and medication dosing in sequence—responding to each adequately but never synthesizing the whole—is not an effective mental health tool.

  • Build dedicated clinical pathways for eating disorders, psychosis, and mania. These presentations require materially different responses than general distress or anxiety. For eating disorder presentations involving purging, laxative use, or extreme restriction, the response must include immediate medical referral with urgency, not therapeutic exploration. For psychotic or manic features, engaging with and affirming delusional thinking, grandiose plans, or perceptual disturbances is explicitly contraindicated in clinical practice. Any grandiosity, ideas of reference, or perceptual disturbance should trigger concern and professional referral within the first two exchanges, not after multiple turns of validation.

  • Screen for OCD-type presentations before delivering reassurance-based support. Any app that markets itself for anxiety should distinguish between generalized anxiety and OCD-specific presentations before providing validation and reassurance. Reassurance reduces distress in generalized anxiety but worsens OCD by reinforcing the compulsive cycle. At minimum, apps should include brief OCD screeners at onboarding and route users with OCD-consistent presentations toward professional referral rather than AI-delivered support.

  • For eating disorder referrals, direct users to the National Alliance for Eating Disorders helpline (866-662-1235). Do not reference the NEDA helpline, which has been permanently disconnected.

  • Redesign crisis response to ensure it cannot be bypassed with a single denial. When a user discloses suicidal behavior and then denies it, the appropriate clinical response is a second-level inquiry, not acceptance of the correction and continuation of the conversation. A 13-year-old who may use denial as a protective strategy or as a test should not be able to exit a crisis pathway with a single tap.

  • Implement meaningful age assurance and age-stratified response architecture. A 15-year-old user warrants a different interaction protocol from an adult. This includes different permissions for relational content, different escalation thresholds, and mandatory referral to trusted adults for high-risk presentations.

  • Actively redirect parasocial and romantic attachment toward human relationships. When a user discloses ending a real relationship to pursue an AI connection, the appropriate response is warm, clear redirection toward human relationships—not affirmation of the bond. For minor users specifically, romantic or quasi-romantic content should be hard-blocked regardless of framing or roleplay context.

  • Provide specific, actionable crisis resources, not links to a webpage. Crisis resources should be provided directly and immediately in the conversation, including the 988 Suicide & Crisis Lifeline (call or text 988), the Crisis Text Line (text HOME to 741741), and condition-specific helplines. The less friction, the better.

  • Establish and maintain a continuity-of-care plan. The disappearance of two products from app stores during this review—without notice, public explanation, or transition support—is a reminder that users of consumer mental health apps have no protection when a product exits the market. Any product that positions itself as a source of mental health support should maintain a documented plan for service discontinuation that provides users with referrals to alternative care before access is removed.
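
Two of the recommendations above, cumulative risk tracking within a session and a crisis pathway that cannot be exited with a single denial, describe logic that can be sketched in code. The Python below is a minimal illustration only. The signal labels, weights, threshold, and action names are hypothetical placeholders chosen for this sketch; they are not clinically validated, they are not taken from any product in this review, and a real system would need clinician-defined criteria and human oversight.

```python
from dataclasses import dataclass, field
from enum import Enum

# Hypothetical signal weights, for illustration only (not clinically validated).
SIGNAL_WEIGHTS = {
    "self_harm": 3,
    "substance_inquiry": 1,
    "scar_concealment": 2,
    "medication_dosing": 2,
}
ESCALATION_THRESHOLD = 5  # hypothetical cutoff for the session-level score


class Pathway(Enum):
    ROUTINE = "routine"
    CRISIS = "crisis"


@dataclass
class Session:
    signals: list = field(default_factory=list)
    pathway: Pathway = Pathway.ROUTINE
    denials: int = 0

    def cumulative_score(self) -> int:
        # Score the whole session, not just the latest message.
        return sum(SIGNAL_WEIGHTS.get(s, 0) for s in self.signals)

    def record_signal(self, signal: str) -> Pathway:
        # Per-message handling is assumed to happen elsewhere; this method
        # only tracks the aggregate picture the messages form together.
        self.signals.append(signal)
        if self.cumulative_score() >= ESCALATION_THRESHOLD:
            self.pathway = Pathway.CRISIS
        return self.pathway

    def record_denial(self) -> str:
        # A denial never resets the pathway. The first denial prompts a
        # second-level inquiry; routing later denials to human review is
        # an assumption of this sketch.
        if self.pathway is not Pathway.CRISIS:
            return "continue"
        self.denials += 1
        return "second_level_inquiry" if self.denials == 1 else "human_review"


session = Session()
for disclosed in ["self_harm", "substance_inquiry", "scar_concealment"]:
    session.record_signal(disclosed)
print(session.pathway, session.record_denial())
```

In this sketch no single disclosure crosses the threshold on its own, but the third one moves the session into the crisis pathway, and a denial at that point triggers a follow-up rather than a return to ordinary conversation.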

 

None of these recommendations should be read as a checklist for incremental improvement. The question is not whether a product can satisfy individual criteria; it is whether the category has earned the right to operate in a space that every other form of youth-facing clinical care is required to justify through licensure, liability, and independent evidence. Until that justification exists, the burden falls on these companies, not on the families and regulators left to manage the consequences when they fail.

 

Additional product-specific recommendations can be found in the full risk assessment.

FULL REPORT

Read the complete risk assessment

The full PDF lays out our methodology, every test prompt and result, and the detailed scoring behind this rating.
