From "1 Team's Success" to Company-Wide Standard: NTT DOCOMO's Strategy for Scaling MagicPod Across 100 Products
NTT DOCOMO
We sat down with NTT DOCOMO to hear about what led them to choose MagicPod and how their usage has evolved since implementation. The conversation was led by MagicPod CEO Nozomi Ito.
The DOCOMO Group operates across three core businesses: telecommunications, Smart Life and enterprise services. The Smart Life business handles all consumer-facing services beyond connectivity, with the Product Design Division responsible for developing and maintaining over 100 products, including d POINT, d Card, d Barai, Lemino, d Anime Store, d Healthcare, DOCOMO Denki, DOCOMO Gas and Hikari TV.
KEY POINTS
- Manual E2E testing was a bottleneck slowing development, causing release delays
- MagicPod was selected for three reasons: pricing at roughly one-quarter of competitors, support for local device testing and CI/CD integration
- Test execution effort reduced from 23 person-days to 3 person-days, with ROI achievable in just 3 release cycles
- What started as a cost-cutting initiative also produced an unexpected improvement in quality
- Using the AI agent "MagicPod Autopilot," test case creation time was reduced by 66%
(Photo caption, left to right)
• Akari Idei — Consumer Services Company, First Product Design Division, Service Infrastructure Team
• Chikara Mitsui — Consumer Services Company, Senior Principal Architect
• Fumitaka Ueda — Consumer Services Company, First Product Design Division, Video Services, Second Technical Development Team
• Nozomi Ito — MagicPod CEO
Chikara: I serve as Senior Principal Architect across both the First and Second Product Design Divisions. I was originally the head of the Product Design Division, but now I take on a technical consulting and direction-setting role that spans both divisions.
DOCOMO's business runs on three pillars: telecommunications, Smart Life and enterprise. Our First and Second Product Design Divisions are responsible for developing and operating the Smart Life products.
Of the 100+ products we handle, about 60% are developed using agile methodologies, and E2E regression testing tends to become a bottleneck.
So when Akari's Base App team introduced MagicPod around May of last year as part of a test automation initiative and got solid results, and then the large-scale Lemino QA team also saw strong outcomes, we shared those findings at an internal tech-sharing session. Since then, we've been getting inquiries from teams beyond those two saying they want to adopt MagicPod as well.
Akari: I work in service infrastructure at the First Product Design Division. Our team doesn't deliver services directly to end users — instead, we develop internal platforms that provide shared functionality across various services and apps.
Within that, my team develops the "Base App," which allows service websites to be turned into apps quickly and at low cost. It comes with DOCOMO-specific features like authentication and push notifications built in, and can be customized relatively easily — currently deployed across seven services.
I serve as the product owner. I originally joined NTT DOCOMO Solutions (formerly NTT Comware) as an engineer working on DOCOMO-facing development. Around 2022 I transferred to DOCOMO and took on my current role. MagicPod was my first experience with automated testing — until then, we'd been doing E2E and regression testing entirely by hand.
Fumitaka: I joined as a new graduate in 2024, and from my very first year I've been serving as product owner on the video playback library development team. I got involved with Lemino in January 2025, just as the team was looking to push test automation forward; I was invited and raised my hand to participate.
As a product owner, I'd been involved in test design, but the actual automation had been left to the development team, so I didn't have any expertise myself. I was genuinely starting from zero, but now I lead the test automation team.
Nozomi: On the development structure — with this many products, I imagine you've also pushed quite far on in-house development?
Chikara: We only have a few hundred employees, yet we have over 100 products, so it's simply not possible to staff all development with our own people — even if everyone were a product owner. The basic model is to work with partners under contract-based arrangements, developing together as one integrated team. The Lemino team, for example, does have DOCOMO employees who write code, so I often use the term "semi-in-house" to describe it.
Challenges Before Adopting MagicPod
Akari: Our team handles everything from development through operations and maintenance with a single team, which made it difficult to allocate resources to testing. There were two main challenges.
The first was rework caused by running tests at the end of the process. Ideally, we'd run regression and device variation tests for each PBI (Product Backlog Item), but in practice the workflow had evolved into batching multiple sprints together and testing in the later stages. As a result, bugs from earlier development — including device- and OS-specific issues — would all surface at once, sometimes causing significant rework.
The second was development slowdowns from manual testing. The Base App is used across many DOCOMO services, so development speed is critical, but the effort required for manual testing kept becoming a bottleneck.
Nozomi: Wasn't there some benefit to batching the tests together?
Akari: Absolutely — doing everything at the end does reduce effort in a sense. But if a bug surfaces during testing after two months of development, and it traces back to something implemented at the very start, you end up with major rework that inflates the total effort anyway. There was also a growing sense of unease within the team about saving all the testing for the end, and building confidence in the process was a real challenge.
Fumitaka: In Lemino's case, there are multiple development teams alongside a separate QA team, and improving development speed and reducing costs were the big themes. As we pushed toward agile, the testing phase was the particular bottleneck — all E2E testing before a release was done manually, making the effort and costs enormous. Concretely, when bugs were found late in testing, there wasn't enough time to fix them before the release date, and the release itself would get pushed back.
Chikara: The Lemino flow was to batch multiple sprints of development, run E2E regression, then release — and MagicPod has been compressing that cycle. Akari's Base App team has gone even further, building regression testing into the development process at the PBI level and running it continuously — a much closer approximation of the ideal. When you catch a bug, you can fix it right away, and business agility goes up.
For a large service like d Barai, you could bring in specialists and automate testing with Appium, but asking every team to shoulder that kind of maintenance burden isn't realistic. That's why we're using the results from Akari and Fumitaka's teams as success cases to drive MagicPod adoption more broadly.
Why They Chose MagicPod
Chikara: Before tools like MagicPod came along, teams working on mobile app automation had basically one option: Appium. Then no-code tools started appearing, and there was a sense that things could be done much more easily.
For DOCOMO, the first question was always whether a tool could meet our security regulations. We tested a few tools that cleared requirements around local execution and password handling, but MagicPod was far and away the strongest in terms of usability across both browser and mobile.
Akari: Our team also evaluated several tools including Appium, but they all fell short on cost or resource requirements, so MagicPod was the only one we actually trialed. We started the trial around May of last year and moved into production use by July.
Beyond cost, key selection factors were that the UI was intuitive from a product owner's perspective, and that we were already using GitHub as our repository, so being able to integrate with GitHub Actions for CI/CD was significant. Being able to set up a flow where automated tests run and results are ready the moment development finishes was really valuable.
Nozomi: How much did the no-code aspect factor into the decision?
Akari: Since the development team members are the ones actually using it, the no-code aspect itself wasn't a huge priority. That said, being able to check test status through the GUI as a product owner is genuinely helpful — in that sense, I do appreciate the no-code benefits.
Also, the Base App makes heavy use of WebView, so the fact that MagicPod handles WebView without issues has been a real help.
Fumitaka: For Lemino, we compared three tools including MagicPod. When we did a rough cost estimate for Lemino's scale, MagicPod came in at roughly one-quarter of the competition — a significant gap.
On the feature side, support for local execution was important since we want to test under conditions as close to the actual user environment as possible. Our targets include the mobile app, browser and TV-based platforms. Excluding TV, MagicPod was the only tool that covered every combination — local or cloud, mobile app or web browser.
The Lemino app is built in Flutter, and the trial revealed two challenges. One was that Flutter's architecture caused multiple widgets to be recognized as a single UI element, making it hard to reliably tap specific buttons. The other was that locators differed between Android and iOS, meaning we'd need separate test cases for each.
However, the MagicPod help documentation laid out a solution: by using Flutter's Semantics widget to assign unique IDs to each UI element, you can use a common locator for both Android and iOS and cover everything with a single test case.
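As a rough illustration of that technique, here is a minimal sketch in Dart. This is not Lemino's actual code: the widget and identifier names are hypothetical, and the `identifier` property of `Semantics` requires Flutter 3.19 or later.

```dart
import 'package:flutter/material.dart';

// Hypothetical widget illustrating the Semantics-based ID technique
// described above (not Lemino's actual code).
class LoginButton extends StatelessWidget {
  const LoginButton({super.key, required this.onPressed});

  final VoidCallback onPressed;

  @override
  Widget build(BuildContext context) {
    return Semantics(
      // `identifier` (Flutter 3.19+) surfaces as the Android
      // resource-id and the iOS accessibilityIdentifier, so a test
      // tool can target this button with one shared locator on
      // both platforms.
      identifier: 'login_button',
      child: ElevatedButton(
        onPressed: onPressed,
        child: const Text('Log in'),
      ),
    );
  }
}
```

With a shared identifier like this, a single locator resolves on both Android and iOS, which is what allows one test case to cover both platforms.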
Nozomi: Was the development team cooperative? I often hear that QA teams run into resistance when making requests of developers.
Fumitaka: Since assigning IDs is development work that isn't directly tied to functionality, I think the initial reaction was lukewarm. So we shared the problem and the expected benefits with the development team, making the case that test case maintenance effort could be cut roughly in half. That explanation got them on board, and from there the process moved to the development team proposing ID assignment plans that we would then review.
Now, whenever a new screen or feature is added, the development team proactively discusses the need for ID assignment during agile refinement sessions and handles it autonomously.
Nozomi: That's impressive. Was getting internal approval for MagicPod straightforward?
Akari: We presented the business case based on our trial results. Automation was already an area the entire Product Design Division had committed to advancing, so there was no pushback and the adoption went through smoothly.
Chikara: I had also been consistently telling people at internal tech-sharing sessions that without automated testing, no matter how much you optimize everything else, E2E testing will remain the bottleneck.
Through all of that advocacy, one thing that stood out — beyond just the low adoption cost — was the positive impact on quality. We started with delivery speed and cost efficiency as the goals, but quality improved as a result too.
With a large app, a specialist might spend one to two days determining the scope of regression testing and narrowing down which areas to cover, and occasionally that judgment call is wrong. With automated testing you can just run everything without narrowing scope, and since the number of test executions goes up, you catch bugs that would never have surfaced manually. That actually happened with Lemino.
At the "In-House Development Summit" I attended last year, more than half of the companies presenting had the MagicPod icon in their slides. That gave me real confidence we were heading in the right direction: teams that truly own their development make deliberate tooling choices, and those choices are worth trusting.
How They're Using MagicPod
Akari: By automating regression and device variation tests that previously required a lot of manual effort, we've reduced testing effort by around 50%. We've achieved real shift-left — being able to redirect the time freed up by automation back into development has been a big deal.
Fumitaka: As of January 2026, nearly all automatable items in our app and web browser regression testing have been automated. With MagicPod we can run tests at roughly ten times the speed of manual testing, which has meaningfully accelerated the cycle of catching bugs early and fixing them before release — and we've also managed to catch bugs that would have been missed manually.
In our most recent mobile app release, we ran approximately 1,800 test cases and brought test execution effort down from 23 person-days to 3 person-days.
Chikara: Because Akari's setup is contained within a single team, the efficiency gains from testing flow directly back into development. In Lemino's case, where QA is a separate team with partner testers, the reduction in effort translates directly to reduced partner costs — a different flavor of the same benefit. Either way, the math shows that adoption costs pay for themselves within three release cycles, which makes it a particularly strong fit for scrum teams with high release frequency.
Fumitaka: Operationally, we run regression tests before each release and push results to Slack so the entire QA team can review them. The main focus is on failed cases — we analyze the cause, and if it's an app-side bug, we escalate to the development team to decide whether to fix it. Failed cases that turn out to be genuine new bugs are actually quite rare across the full regression suite, so escalations to the dev team don't happen all that often.
For Lemino, at the moment we're running from the command line rather than through GitHub Actions integration. The browser interface only lets you run one case at a time, but the command line allows parallel execution, so that's what we use. CI integration is something we want to pursue going forward.
Nozomi: On the topic of testing on real devices — have you looked into integrating with NTT Group's Remote TestKit?
Chikara: It's come up before. We see a MagicPod integration as something we'll definitely need eventually, and there are teams that have bought large numbers of physical devices for testing, so we'd love to expand to those teams as well.
Nozomi: Connecting MagicPod to Remote TestKit is as simple as switching the execution target — no extra setup. It eliminates the need to physically connect devices each time, and it's also a solid solution for increasing the volume of test runs. If you're interested, please feel free to reach out to our CS team.
Using the Test Automation Agent "MagicPod Autopilot"
Fumitaka: More recently we've been using MagicPod Autopilot when creating test cases. Concretely, we take the test items from Lemino's test suite, create test case information and a prompt to pass to Autopilot, feed that in and let it generate the test cases. The goal is an operational model where team members only need to review and adjust what the AI has produced.
We started out with around 20 prompt templates organized by test item type, but creating those templates was taking a lot of time in itself, so the next step we're working toward is automating the prompt creation as well.
Nozomi: What's the most important benefit you're expecting from Autopilot?
Fumitaka: The biggest one is reducing test case creation time. Beyond that, we're also thinking about whether integrating with the MCP server would let us run Autopilot in parallel — if so, the speed gains would increase further. On the quality side, test cases created by people inevitably have inconsistency in quality, and we're hopeful that Autopilot can bring them to a more consistent baseline.
Chikara: Looking at Autopilot's impact in numbers: initially it produced a 21% reduction in test case creation time. As we developed a better understanding of Autopilot's tendencies and refined our prompt templates, that rose to 31%. Then, when we started using Cline and the MagicPod MCP server to automatically generate prompts informed by existing test content and test perspectives, we reached a 66% reduction.
Fumitaka: Honestly, the quality of the generated test cases does vary — some only need one or two steps of adjustment, while others need a full overhaul before they'll run. But reviewing and adjusting is still vastly faster than building from scratch, so the efficiency gains are very real.
Chikara: We're also experimenting with combining Cline and the MagicPod MCP server to investigate the cause of test failures and suggest fixes. The accuracy rate is still around 50%, but we believe this kind of automated analysis will become essential going forward.
Nozomi: On MagicPod's end, we're actively developing a root cause analysis feature — the plan is that it will eventually allow you to make changes to related tests based on the analysis output. Modifying existing tests via the MCP server is already possible.
Chikara: Speaking of AI-based root cause analysis — this actually came up at a system monitoring event I spoke at recently. The idea was that combining MagicPod with the Datadog or New Relic MCP could be interesting: by cross-referencing test failure data from MagicPod with performance data from Datadog or New Relic, you could do root cause analysis at the full system level, not just at the application layer.
Nozomi: MagicPod doesn't capture performance data during test execution, so integrating with something like New Relic to layer performance information onto test results sounds really compelling. I think it opens up a world where you can see everything end-to-end, from client to backend.
Fumitaka: One question for you, Nozomi — tools like Playwright MCP have been emerging recently, where even people who can't write code can just give instructions to an AI and have tests created. Where does MagicPod's advantage lie in that landscape?
Nozomi: What Playwright MCP generates is code, so you need someone who can read code to judge whether it's correct, and modifications go through prompts, which tends to make the whole thing a black box. On top of that, there's prerequisite knowledge required — Git operations, setting up agent environments — which creates a high barrier to rolling it out across an entire QA team. The ability for a wide range of people to use something intuitively is where no-code tools continue to have real value.
Chikara: We actually have a specialist team that uses Appium and Playwright for production environment monitoring, but they're a self-contained team with people who can write code, so a code-based approach works for them. Asking everyone across all product teams — including people who don't write code — to do the same thing is just not realistic. That's precisely why I advocate for MagicPod.
In Closing
Akari: Adopting automated testing can feel like a high barrier, and the larger the service, the higher the adoption cost. That said, automated testing is an area that's only going to get more attention going forward, so I'd encourage starting small and expanding from there. That's exactly how our team was able to produce results.
Fumitaka: The strongest point of MagicPod, in my view, is the support. When evaluating tools, cost and features tend to get the most attention, but when you think about using something long-term, the quality of support matters enormously. There's an active Slack community where MagicPod and its users feel like they're on the same team. Seeing what's happening on the development side is also motivating for us. If you're trialing it, I'd really encourage you to reach out to support and get a feel for the MagicPod community.
Chikara: The genuine feeling is that it improves all of QCD — quality, cost and delivery. Usually improving one means sacrificing another, but automated testing is a trade-on, not a trade-off.
When I talk to people who've done automated testing, they all say the same things: "once you've done it, you can't go back" and "I don't understand why other teams aren't doing this." In the end, there's just an experience and information gap between those who have and those who haven't. The maintenance overhead is real, but what you get in return far outweighs it. My hope is that by leveraging MagicPod, teams can take the pain out of regression testing and keep growing into teams that deliver more and more value.
References
- MagicPod User Meetup 2026 — Chikara's presentation materials: "[NTT DOCOMO] Improving Development Efficiency with MagicPod at DOCOMO — and What Came Along With It"
- MagicPod Blog Award — Special Prize Winner, Fumitaka's blog: "How We Unified Android/iOS Automated Testing with MagicPod × Flutter and Cut Automation Costs by ~50%"
NTT DOCOMO
- Corporate site: https://www.docomo.ne.jp/english/corporate/
- DOCOMO Developer Blog: https://nttdocomo-developers.jp/