Mar 12 2026
Secure Mobile Frameworks
Listening time: 47 mins

Guest(s): Tanu Jain, Security Engineer; Alex Kube, Software Engineer

At Meta, even seemingly simple engineering tasks—like updating an API—become monumental undertakings when you're dealing with millions of lines of code and thousands of engineers, especially if the changes are security-related. Pascal talks to Alex and Tanu about the challenges and learnings in making mobile frameworks at Meta more secure at a scale that few companies ever experience. Explore the compelling crossroads of security, automation, and AI within mobile development.


Transcript:

Pascal: Adding a new API wrapper is hardly worth a podcast episode. So why are we here? Because seemingly ordinary engineering tasks can become prohibitively hard once you reach Meta scale, with tens of millions of lines of code, thousands of engineers committing to a single monolithic repository every day, and thousands of call sites for the API in question. To discuss how we keep our apps safe and secure by performing codemods at a mind-boggling scale, I'm joined by two engineers on the product security team. Alex and Tanu, welcome to the Meta Tech Podcast. Tanu, maybe I can start with you before we jump into the actual topic. How long have you been at Meta and what did you do before?

Tanu: I've been part of Meta for about seven and a half years. Uh, I have worked in different roles throughout this journey, um, starting from corporate security to web security, and right now my focus is on WhatsApp, specifically Android and iOS. Um, before joining Meta, I worked in various industries as a full stack developer, so my experience is in both full stack development as well as security. Um, my transition was mainly when I thought of pursuing a master's in cybersecurity from Northeastern. That's when I came to the U.S., and right after that I joined Meta.

Pascal: Fantastic. You definitely had, uh, quite a look around the company; that's gonna be super helpful for this conversation. Alex, can I pass it on to you?

Alex: Yeah, absolutely. Um, I've been at Meta for close to six years now. Uh, primarily focused on product security roles the entire time. Uh, the teams that I've worked with have been mostly horizontal in nature, so we've tried to design these sorts of frameworks that support all of the different app verticals, like WhatsApp, Instagram, Facebook, and so on. Uh, prior to Meta I did a variety of like public and private contracting and did some work with a drone startup actually. So trying to connect drones to cell networks and figuring out how to do that securely was actually one of the things that got me interested in cybersecurity as a topic and, uh, I've had an opportunity, uh, both there and at Meta to sort of pursue that passion, which has been a lot of fun.

Pascal: Fascinating. That must have been quite the change then, going from the more kind of hardware role into really focusing deeply on specific software issues.

Alex: Yeah, absolutely. I mean, that was a little more full stack. Um, you know, in terms of like, yeah, like you said, we were all the way from the hardware trying to figure out, you know, the cell network connection all the way up to like the user facing, uh, software where people would actually like, connect their drones to the, um, the software and, you know plan their missions and everything like that. But I think it is, uh, it's a lot of fun to be able to focus more specifically on the security here at Meta. And uh, certainly like you said, that's given us a lot more of a depth of knowledge and a depth of application that we didn't have elsewhere.

Pascal: Let’s talk about your team. So is there a way of quickly summarizing your team’s mission?

Tanu: Yeah, absolutely. So we’re part of Central Security. Uh, specifically if you talk about the whole org structure, the product security org. The mission of our team is to focus on protecting users and their data from cybersecurity threat actors. Um, that also includes adversaries who are attempting to harm those users by exploiting weaknesses in various areas. That includes like product design, implementation, um, configuration, and similar mechanisms. So our job is to protect user data from the exploitation that can happen in all of our, uh, family of apps: Facebook, Instagram, WhatsApp, and others.

Pascal: Fabulous. And we have a really specific issue here, so we don't need to talk in these kinds of broad strokes, but can focus on one particular example. You recently published a blog post, and I wanted to go a little deeper on it. You talked about some issues with Android's default intent system, and I think even the folks who are not Android engineers will probably understand the whole mechanism: how you basically construct little URL-like things that you can pass around to tell the app, open this activity. So not too dissimilar to links. But can you talk about the specific problem that you needed to tackle?

What exactly was it that made the default intent system insecure, in air quotes, for us?

Alex: Yeah, absolutely. Uh, so as you mentioned, uh, you know, Android's intent system is a way of passing data between apps in a structured manner, and, uh, the name of the system itself sort of gives you an idea to the function. You know, you express an intent that you want to do something and then the system will figure out how to actually go and do that on your behalf.

Uh, so there's kind of two types of intents. There's implicit and explicit intents. An implicit intent would be something where I say, I wanna send a text message, or I wanna send an email, or open a web browser, or something like that. Um, where we literally just basically give it a topic and the system will say like, okay, I have a list of apps that can handle this.

I'll either let the user select one or use some sort of heuristic on my side to select which app is actually going to do this. Um, an explicit intent is, you know, sort of the same thing, but you might, as a developer, give a little more information to the system, where you'd say like, okay, um, I want to open, uh, this email message with a specific email app, where you can say like, okay, I care about the package. Um, but the thing that's always been kind of missing with this system, even as Android's evolved it to make it more secure over time, is the ability to really pin these, uh, sorts of intents to specific publishers.
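To make the distinction concrete, here is a minimal, hypothetical model of intent resolution, written in Python purely for illustration (real Android resolution is far richer): an implicit intent names only an action and matches every registered handler, while an explicit intent names one package. The actions and package names are made up; the point is that neither form says anything about who published the handling app.

```python
# Toy model of Android-style intent resolution (illustrative only).
# Handlers "registered" on the device: action -> list of package names.
HANDLERS = {
    "android.intent.action.SENDTO": ["com.example.mail", "com.other.mail"],
    "android.intent.action.VIEW": ["com.example.browser"],
}

def resolve(action, package=None):
    """Return candidate packages for an intent.

    An explicit intent (package given) targets exactly one package;
    an implicit intent (package=None) matches every registered handler
    for the action, and the system or user picks one.
    """
    if package is not None:           # explicit intent
        return [package]
    return HANDLERS.get(action, [])  # implicit intent: all handlers

# Implicit: the system offers every mail app; the developer has no say.
print(resolve("android.intent.action.SENDTO"))
# Explicit: one package is named -- but still no publisher check.
print(resolve("android.intent.action.SENDTO", package="com.example.mail"))
```

In both cases the resolution is by name only, which is exactly the gap the publisher-pinning discussion below is about.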

Um, so you could imagine like, you know, Facebook is published by Meta, but, uh, the way that you install packages on Android means that, you know, if you install Facebook from Meta the first time around, uh, you know, you'll have a relatively secure experience and you'll be using our version of the Facebook app.

And we're relatively trusting of, you know, sharing data from Instagram to Facebook, as long as the user has explicitly consented to that. But the problem that we have is that there's nothing that really prevents someone from sideloading an app that pretends to be Facebook but is not published by Meta, and that, when you try to share something from Instagram to Facebook, could then, you know, upload some sensitive data to some website or perform any other action on behalf of the user.

So really the problem that we're trying to solve is to take this intent system and sort of add an extra layer of security on it, where we can make sure that we're not just communicating with the apps that we think are published by Meta, but only communicating with the apps that are verifiably published by Meta, right?

Pascal: And from Google's perspective, or potentially even pre-Google times, it makes sense that this was potentially not a consideration of the intent API, because as far as I'm aware, this was around from the very first version that had a public SDK. So it is one of the oldest APIs in the Android SDK, and somebody can fact-check me on this, but I worked on some very early Android apps and I definitely remember writing intents in there.

Alex: Yes. Yeah, no, you're absolutely correct. I mean, I think that this is a very foundational piece of, you know, the Android operating system because of that. So like, you know, even when Google wants to make changes and makes changes, um, you know, it can be very difficult to do that in a way that doesn't, you know, massively break apps.

Uh, so the changes that they make tend to be relatively incremental and, you know, still need to support use cases where you don't have things like the Google Play Store installed for, like, uh, you know, developer verification or anything like that.

Pascal: Right, and the whole problem is not too dissimilar to what you have on the web, where you have URLs and somebody could manipulate your DNS response and send you to the wrong Facebook.com. But the way we fixed this on the web is by having certificates. And I'm sure we will get to this, but can you talk about what the fix is?

How do you actually ensure that the party you're talking to, the other app, is actually the one you're expecting it to be?

Alex: Uh, well, it's actually really interesting that you brought up the web analogy because we also rely on certificates on the Android side. So, uh, when an app gets published, uh, the developer signs that with their private signing key, and then there's a certificate that gets embedded in the APK that's you know, delivered to the, uh, user's device.

And then, uh, via the Android package manager APIs, we can, you know, look up for a particular package name, which certificate was stored when that app was installed, and then we can compare that against a known list of these certificates that correspond to our private keys and use that as sort of the basis for doing that, uh, verification that we're communicating with the Meta published version of the app that we expect.
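The check Alex describes can be sketched as a small pure function. This is a simplified, assumption-laden Python sketch, not the real framework: the byte strings stand in for certificates that, on a real device, a PackageManager lookup would return for an installed package, and the allowlist holds SHA-256 digests of the publisher's known signing certificates.

```python
import hashlib

# Hypothetical allowlist: SHA-256 digests of the publisher's known
# signing certificates. Any match is acceptable.
TRUSTED_CERT_DIGESTS = {
    hashlib.sha256(b"meta-signing-cert-1").hexdigest(),
    hashlib.sha256(b"meta-signing-cert-2").hexdigest(),
}

def is_trusted(installed_cert: bytes) -> bool:
    """Compare the certificate recorded for an installed package
    against the allowlist of known publisher certificates."""
    return hashlib.sha256(installed_cert).hexdigest() in TRUSTED_CERT_DIGESTS

print(is_trusted(b"meta-signing-cert-1"))   # genuine app: True
print(is_trusted(b"sideloaded-impostor"))   # sideloaded fake: False
```

The key property is that the comparison is against the certificate actually recorded at install time, not against the package name, so a sideloaded impostor with the right name but the wrong signer fails the check.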

Pascal: That makes total sense, and it sounds relatively simple. It is honestly a little surprising that there isn't at least an API that Google offers that does the same thing. But potentially we also need to talk about one of the differences that we have to many others. Usually you're basically having your internal traffic and your external traffic. But as you've described, what is different for us in particular is that we have a lot of cross-app communication. So we have the family of apps, and they need to be able to securely talk to each other and pass data between them. So is that the main differentiation between our kind of problem set and what you see in the, I guess, like, outside world?

Tanu: I think, um, Meta is very diverse in the sense that we have a lot of apps which are consumer facing, so billions of users are using Facebook, Instagram, and WhatsApp. Um, a very clear example, or like a good one, would be, uh, account center linking, where users are given, uh, this account center where they can link all of their Facebook, WhatsApp, and Instagram accounts together so that they don't have to remember, like, passwords for each one of them.

And then they can leverage that: basically, if they are logged into, let's say, Facebook, they can easily log into Instagram or WhatsApp. So this is a feature which was recently released. And um, in order to have this functionality working, there needs to be a lot of communication between all of these apps, where, uh, there would be tokens passed around, and that's where, um, this unique infrastructure problem comes in, um, just because of how Meta operates and how user data is passed around. That's why we need to be much more careful about this inter-process communication and need to make it much more secure. Uh, I think in industry the same problem might exist in other areas as well, uh, maybe with Google or other companies.

But I think, uh, just because here it's much more at scale and, uh, um, the data which is passed around is much more critical, that's where we need to make it, uh, much more secure and prevent, like, intent hijacking and spoofing kinds of attacks.

Pascal: Yeah, that makes total sense.

Alex: Yeah, and I, I think this also sort of, you know, speaks to, you know, sort of, uh, the design of the intent system and everything like that. And you know, why some of these features might not exist in the standard libraries is because, you know, for Google, I think as a platform publisher or any platform publisher, you wanna make sure that these APIs, uh, lead to a good user experience at the end of the day. Um, you know, for a company like Meta or Google or Microsoft or some of these other companies that have, you know, larger families of apps, you know, we have a small handful of signing keys and we expect a lot of traffic between those apps. So, you know, we need something or we want something where, you know, we rely on that.

But for your average developer, where you don't know which app is going to handle your request to send an email or a text message or something like that, you really don't want them trying to get down in the weeds about which specific app, or especially which signing key, is going to be used for that app.

Because like that's just gonna lead to a lot of breakages as developers rotate their signing keys, as new apps come on and off the market, and everything like that. So I think like, you know, this is a unique problem to a small subset of companies, you know, as Tanu mentioned, where like we really do have, uh, you know, a lot of traffic between our apps, and we know exactly which set of apps that is going to be.

Pascal: Yeah, it reminds me a bit of the old stories about trying to make encrypted email work, and it was so easy: you just need to generate your private and your public key, upload it to your key server, and then go to a key-signing party with some other people and exchange keys. So simple, right? No. It's like, obviously most people just want to get messages across, and in the same way, most developers just want to open a web browser with an intent and don't need the additional overhead of managing certificates as well. So, okay, I feel like that is a fairly nicely defined problem. What was your solution on the API surface? So what do your developers need to change, instead of using the standard intent system, to ensure that they securely communicate with other apps?

Alex: Yeah, absolutely. Uh, so basically what we did is we wrapped the intent-sending APIs with our own set of, um, you know, functionality. So we built this, uh, secure link launcher framework, which, uh, we mentioned in the blog post, which gives you all of the, uh, same levers to pull and buttons to push and everything like that.

So you can send a standard intent, uh, without having to, you know, figure out some custom new system for that. But we basically developed a way to better specify the destinations that you expected that intent to end up in. You know, as we mentioned before, there are sort of implicit and explicit intents.

So what we do is we basically turn implicit intents into explicit intents, and then we do an additional set of checks for the certificates and everything like that, that the developers want on top of what the Android system would do, before we actually send the intent. So essentially what we have is an API that gives developers the same sort of functionality but asks them questions, basically, before we allow them to send that intent, um, to another app on the device.
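The shape of that wrapper can be sketched roughly as follows. This is a hypothetical Python model, not Meta's actual framework: the package names and certificate "digests" are invented, and the dictionaries stand in for pinned publisher certificates and for what the device's package manager would report. The flow is the one Alex describes: pick a concrete destination (making the intent explicit), verify its signing certificate, and only then send.

```python
# Hypothetical sketch of a "secure launcher" wrapper: resolve an
# implicit intent down to one concrete package, verify that package's
# signing certificate, and only then send the now-explicit intent.
EXPECTED_CERTS = {"com.instagram.android": "digest-A"}   # pinned publisher certs
INSTALLED_CERTS = {"com.instagram.android": "digest-A",  # what the device reports
                   "com.fake.instagram": "digest-X"}

def secure_send(action, candidate_packages):
    """Send `action` to the first candidate whose installed certificate
    matches the pinned one; refuse if no candidate verifies."""
    for pkg in candidate_packages:
        if pkg in EXPECTED_CERTS and \
                INSTALLED_CERTS.get(pkg) == EXPECTED_CERTS[pkg]:
            return ("sent", action, pkg)   # explicit, verified destination
    raise PermissionError(f"no verifiable destination for {action}")

# Genuine Instagram verifies; a sideloaded fake would raise instead.
print(secure_send("share_photo", ["com.instagram.android"]))
```

Failing closed (raising rather than falling back to an unverified send) is the property that distinguishes this from the stock intent APIs described earlier.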

Pascal: Right, and in the intro I said that introducing a new API is not exactly newsworthy per se. That is kind of what all of us engineers do practically every day. But just the big asterisk here is that we're doing it at Meta scale, which makes everything just a little bit more complicated. So why was this not a matter of effectively a sed s/old intent/new intent/g thing, so replace everything? What made this harder than a kind of string-based codemod?

Alex: Yeah. Um, I think, I mean, I've got some thoughts on this specific framework, but for the overall problem that we're trying to solve, uh, Tanu, I know you've, uh, been doing security here at Meta a little longer than I have. Do you want to speak to some of the challenges in general of this sort of work?

Tanu: Yeah, absolutely. As we talked about previously, at Meta scale we have a family of apps. It’s not just one Facebook or Instagram app working on its own on the Android device; they are heavily talking to each other. And then, um, we need to differentiate, let's say on an Android device, when Facebook is talking to Instagram versus when Facebook is talking to any third-party app on the device. We need to be careful what data is being shared and in what cases.

Does it need to be shared just with the family of apps, or with a third-party app? So, um, the wrappers that we have created are not like one standard wrapper which can replace everything. There is much more distinction for these specific use cases, and different wrappers for different scopes. When anything needs to be communicated between the family of apps, then the scope that we use is family, versus if data should remain within the app itself, the scope is internal.

So we have like five different scopes created for that purpose. Um, so when developers are trying to use these frameworks, um, they need to make sure that they use the right scope in that specific place. So that is the human knowledge which is required, which means we can't just do a codemod or find-and-replace kind of pattern across the whole codebase, which makes it much trickier.
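The scope idea can be sketched as a lookup table from scope name to allowed destinations. Only "family" and "internal" are named in the conversation; the third scope, the package names, and the exact semantics are assumptions made for this illustration.

```python
# Illustrative scope table: scope name -> set of allowed destination
# packages (None = unrestricted). Only "family" and "internal" come
# from the conversation; the rest is invented for the sketch.
SCOPES = {
    "internal": {"com.facebook.katana"},                        # stay in-app
    "family":   {"com.facebook.katana", "com.instagram.android",
                 "com.whatsapp"},                               # Meta family only
    "third_party": None,                                        # unrestricted
}

def destination_allowed(scope: str, package: str) -> bool:
    """A destination passes if the chosen scope is unrestricted or
    explicitly lists the destination package."""
    members = SCOPES[scope]
    return members is None or package in members

print(destination_allowed("family", "com.whatsapp"))     # family app: allowed
print(destination_allowed("internal", "com.whatsapp"))   # out of scope: blocked
```

Choosing the right row of this table is exactly the per-call-site judgment that a plain find-and-replace cannot make, which is what motivates the AI-assisted approach discussed next.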

And that's where, I guess, uh, artificial intelligence or gen AI makes things much easier for us today.

Pascal: Yeah, that is now the fascinating part, because previously this almost felt like a dead end. Okay, there is still the option of just throwing a hundred people at it: everybody just looks at every call site individually and makes that determination. But yeah, now we have entered the space where AI is actually quite good at understanding some degree of context about the code that you've written, and that probably unlocked something.

But can we talk a bit about the degree of context first? So how much does it actually need to understand to make the determination about the correct scope to apply for the intent?

Alex: Yeah, I think, uh, it needs to know sort of like the destination package, you know, if we want to look at this very down in the weeds, right? Because ultimately the security that we're adding here is a certificate check, and to do that certificate check, we have to have the package name.

Um, but the way that, you know, implicit intents work, we don't necessarily have a package name attached at the call site that's generating the intent. You know, that's not a required parameter for creating an intent. So there's, you know, that issue where, in some cases, with the way the system worked originally, the developer wasn't even required to give us a hint that this should go from Facebook to Instagram or Facebook to WhatsApp or something like that.

And in the cases even where developers might have given us that hint, um, you know, that intent might have been created, you know, 50 methods up in the call stack or something like that. So between where the intent is created and where the intent gets sent, there's a loss of context along the way, potentially, where it's very, very difficult to say like, okay, yep.

I know you said, you know, startActivity or, you know, sendBroadcast or something right here, but the information that you used for that came from over here. And, you know, getting a human developer to sort of make that connection, or even using static and dynamic analysis tools, proved to be very difficult to scale with all of this.

Pascal: Yeah, especially because you're talking about so many different apps as well. You would probably have to write very specific code for every app to build up this kind of understanding. And we're talking not just one, but probably like, uh, eight or so that you need to go through that exercise for, potentially more.

Alex: Yeah, absolutely. I, I would say even more. I think, uh, we recently, for another security project, did an audit and I think we have like, you know, 18 plus apps that are actually published through the Google Play Store in some way, shape, or form. You know, some of those have been developed, some of those are acquisitions and everything like that.

But yeah, like you said, and uh, you know, I think you mentioned the mono repo originally in the, uh, earlier in this, uh, conversation, and we do have a mono repo, but that doesn't necessarily mean that every app uses, you know, the same sort of coding standards or the same sorts of tools and libraries and everything like that.

So, you know, you get a lot of, uh, multiplication along a lot of dimensions for the sorts of things that you need to be able to solve for to do this style of migration.

Pascal: Yes, we have a lot of opinionated engineers, and some of them express their opinions inside these apps, so they may look a little different depending on which part of the mono repository you are venturing into. Can we talk about the specific workflow? Because now that we're talking about some AI-driven codemod, some people might think, oh, cool, so you are letting the AI churn, and after a few days it, uh, comes up with one big diff and you ship it in the mono repository and everything is done.

That's not how we do things. So can you talk about the experience that engineers have actually interacting with your code mod?

Alex: Yeah. Uh, Tanu.

Tanu: Um, yeah. So specifically, actually, we had a very good experience. We tried to tackle almost 1,300 call sites for one specific framework, um, using gen AI, and we were able to successfully migrate more than a thousand call sites out of it. Um, out of those thousand call sites, probably initially when we started rolling out, we were expecting a little bit of friction from the developer side, because it takes a little bit of time, when they get the generated code, for them to review and validate whether the changes make sense. But, um, just because these changes were much more intuitive, because of how the frameworks were written, it was kind of easy for developers to, uh, understand the changes and then easily accept, uh, those codemod changes. So there was less friction on that part. Um, but the problem that we faced was kind of the opposite, where the changes were accepted in a much faster fashion than we expected.

So, for example, we were generating like tens per day, and um, they were accepting them much faster than we expected, and there was a little bit of, um, lag in the proper validation. That's where the other challenges came in.

Alex: Absolutely. Yeah. No, I think like, uh, we were surprised by how much the developers trusted the AI to make the correct decision. You know, you can imagine, like, uh, at a company the size of Meta, right? Like, you might have a feature that's developed by, you know, a small team of people. That team gets reorged, uh, people get moved around, and, you know, maybe the feature eventually gets deprioritized or something like that.

So, you know, uh, we've got maybe a different group of people reviewing the diffs than originally, you know, wrote the code or something like that. And, you know, there's just maybe a lack of knowledge there, or maybe just a, a desire to, um, you know, move things along quickly.

So I think it speaks to, you know, sort of a bigger, uh, issue with AI codemods, um, which is just: how do you validate the correctness of your changes? As, uh, Tanu mentioned, we did our best, uh, through our techniques, and then we, you know, sort of expected the developers to pick up the rest of the slack there.

But sometimes the developers that were reviewing this didn't necessarily have the correct context to do the review, which was a challenge.

Tanu: Also, in order to tackle this problem, we made some changes to our workflow and in the diffs which were generated, because earlier it was not very clear to the developers that they were AI generated. So we explicitly put warnings in the generated code that these are gen AI generated, so please carefully review and assess whether, you know, this looks sane and it works.

Pascal: Right, because I guess, like, in the worst scenario, you potentially scope this too tightly and it's actually no longer working as intended, right? It's not really that you make things less secure or anything, because the default was already suboptimal on that front, but potentially you miss, or the AI misanalyzes, where a call site was trying to call to, and restricts it then.

Alex: Yeah, absolutely. And we did see those cases, and I think like, you know, that speaks to both the design of the framework and also the design of the codemod, where you could imagine, um, you know, some of these, uh, scopes that we use to describe which sets of apps we want to allow this intent to go to are sort of supersets or subsets of each other.

You know, we could say, like, there's the whole family of apps, or there are specific subsets, or there are specific apps. Um, you know, in some cases we tried to go for the most specific first with the codemod, which caused, you know, some of our issues. Whereas, you know, if we had just defaulted to sort of that broader scope that still improved security, we might've avoided some of those issues initially.

Pascal: For sure. In general, because, Tanu, you mentioned validation, and that you wanted to stagger things a little as more of these codemods rolled out: what kind of validation steps did you have in place to ensure everything worked as expected?

Tanu: Yeah. Um, maybe I can talk a little bit about the whole workflow, which had some of the validations embedded in there. And then, uh, once the code is generated and sent to the developers for review, then we do extra validations. And these validations are not something that we did explicitly; it's something which happens originally on even the non-AI codemods. For tackling this specific problem, um, our first approach was a non-agentic approach, and we used a Llama model for that, uh, the 17-billion-parameter one, the first one which was available when we started working on this project. Um, our whole approach was to create a dynamic prompt for the AI, and in that dynamic prompt, we embedded the set of lines that needed to be changed with our framework. And how we identified those call sites is using our lint framework. That's basically standard Android lint, which tells us that, hey, this line has this issue, and you should probably not use this and probably move towards using, um, this framework, which is more secure. So we leveraged the lint system to identify the exact call sites: basically the list of all the files and the list of all the lines where the change needs to be made. We created a dynamic prompt and embedded all those lines within the prompt itself. And then we gave that dynamic prompt to Llama, um, and then Llama evaluated what changes needed to be made, modified that code, and gave us back the new code, along with the set of other actions that needed to be taken on it. And the other actions were mainly like adding some imports for the frameworks or adding the dependencies, just because it wasn't an agentic system at that time; it couldn't do, like, X, Y, Z things. We were giving it one specific task to do, and it was doing that specific job of modifying those specific lines with the new, um, with the new framework.
And then our script would kick in, which would do the imports and dependencies inclusion, and then all the other validations would kick in: whether formatting is correct or not, whether the build is succeeding or not, whether the lint issue still exists, because after we fixed it, are there other lint issues which appeared, or does everything look sane? Um, are there other validation failures or other problems which surfaced because of the specific change? So all those validations were done on a recursive basis.

So it is possible that after the first pass, the build might fail for some X, Y, Z reason. So we create a new dynamic prompt, injecting the new error code or the issue, and give it back to Llama, and then it would do it again. We do it five times, just to make sure we are not in a recursive loop, and hopefully by the end of it we get the right codemod, which we can push to the developers. So this is all the validation that we do beforehand. Um, but then once the diff is created, there are all the CI validations that also kick in, whether the tests are passing. Uh, sometimes there are, like, lightweight tests which basically test corresponding to the changes that we have made, but might not test the whole build, um, and sometimes they also get affected. So yeah, there are, like, all these validations in the whole workflow, which are embedded at the various stages of the codemod generation.
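The bounded retry loop Tanu describes can be sketched roughly like this. Everything here is a simplification and an assumption: `rewrite` stands in for the Llama call driven by the dynamic prompt, `validate` stands in for the formatting, build, and lint checks, and the mock functions exist only so the sketch runs end to end.

```python
MAX_ATTEMPTS = 5  # bounded retries, per Tanu, to avoid looping forever

def run_codemod(call_site, rewrite, validate):
    """Drive one call-site migration: ask the model for a rewrite,
    validate it (format/build/lint), and feed any errors back into a
    fresh prompt, up to MAX_ATTEMPTS times."""
    prompt = f"Replace this call with the secure framework:\n{call_site}"
    for _ in range(MAX_ATTEMPTS):
        code = rewrite(prompt)          # model proposes new code
        errors = validate(code)         # formatting, build, lint checks
        if not errors:
            return code                 # ready to send for human review
        prompt = f"Fix these errors:\n{errors}\nin:\n{code}"
    return None                         # give up; route to a human

# Mock model: first attempt is broken, the error-feedback pass fixes it.
def fake_rewrite(prompt):
    return "secure_call()" if prompt.startswith("Fix") else "secure_call("

def fake_validate(code):
    return "" if code == "secure_call()" else "SyntaxError: unbalanced paren"

print(run_codemod("legacy_call()", fake_rewrite, fake_validate))
```

An agentic setup would replace most of this scaffolding, since the model could run the build and lint itself, which is exactly the shift the conversation turns to next.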

Pascal: Just a really quick question. When did you actually start this particular project?

Alex: It, it was in the beginning of 2025.

Pascal: I think it's actually really important to mention this, because a lot of the stuff you mentioned seems almost archaic from where we are today, at the beginning of 2026: this whole idea of, like, well, we couldn't possibly ask an AI model to actually edit the code, so we need to just give it all the lines, take them back, apply them back to the code base, and run the whole thing again.

But it is just fascinating how quickly the environment has changed. You mentioned specifically that we weren't in an agentic world, and that becomes so clear based on your example. Um, yeah, these days it is a lot easier to write these kinds of AI-based codemods, where you can just basically say, hey, here's the file, fix this problem. And then a lot of these validation steps, or the infrastructure that you had to build, is basically, um, work that is done for you effectively.

Tanu: Yeah, that's what we are seeing even today. And I think our future projects seem, uh, much simpler for the use cases that we tackled earlier. Now we are moving on to much more complex problems, which can't be done using, you know, a non-agentic approach. They actually require looking at the whole code base, doing multiple reads and writes, and making decisions on the basis of, like, multiple things.

Alex: Yeah, absolutely. And I think like it's, uh, interesting that you mentioned like, you know, this does look a lot like what we get for free. What's, you know, table stakes these days with agentic AI and everything like that, which I think has been, you know, fun to see, you know, the industry go towards that direction because it's kind of a validation of the approach that we started off with, you know, before a lot of people had, you know, started doing these things at the sort of scale that we were trying to do them.

Pascal: Yeah, this I think will be an interesting time capsule, because the window in which this particular approach was what you would do was probably, like, three to six months. Before that, the models were either not good enough to even attempt something like this, or, after, you had the full-blown agentic suites, and a lot of the additional work that you did around it is basically something you get for free now.

But you were clearly trailblazers, because I feel like you proving that this model works, even with the lack of all the tools that we have now, just showed what is possible now. Whereas previously, if you have 1,300 call sites and want to manually migrate them, this is probably something you won't even start.

Alex: Right, yeah, absolutely. And it has definitely influenced, especially in the security org, the types of projects that we're tackling with AI. I think it's given everyone a lot more comfort that this is a viable approach, that it can have a high enough success rate to be worth investing in.

It's not going to severely annoy developers if we go and produce high-quality diffs by AI, or anything like that. But it also showed us that where we need to invest most is the validation side of things, for two reasons. Number one, it improves developer trust, because they know there's a legitimate test plan with some sort of reproducibility behind it, where they can say, okay, if I want to validate this, I know that I can.

But it also helps us in the cases where previously we might have trusted a developer to do the validation for us. We can say, well, this failed our actual automated validation, our gold-standard validation, so we can avoid publishing those diffs in the first place, kick them off to another queue for a security engineer or someone to review, and then maybe make a decision on that particular call site manually.

But still, if we're going from a hundred percent of diffs authored and reviewed by folks in security to fix security issues, to a hundred percent AI authorship with maybe 1% of diffs reviewed by us and all of them accepted by humans outside of the security org, we've massively improved our ability to scale. We've also very much focused down on the thing that we need to invest time in. We can let the industry tools get better at actually reasoning about code and making the modifications and everything, and we can invest our very limited resources in the validations for the specific types of problems that we care about, so that we can have the best of both worlds from a scaling standpoint.
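The gating workflow Alex describes, where AI authors every diff and automated validation decides which ones are published and which go to a review queue, could look roughly like this sketch. The `Patch` type, the `triage` function, and the toy validator are all illustrative stand-ins, not Meta's actual tooling:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Patch:
    call_site: str   # e.g. "A.kt:10" (hypothetical location)
    diff: str        # the AI-authored change

def triage(patches: List[Patch],
           validate: Callable[[Patch], bool]) -> Tuple[List[Patch], List[Patch]]:
    """Publish diffs that pass automated validation; queue the rest
    for manual review by a security engineer."""
    publish: List[Patch] = []
    review_queue: List[Patch] = []
    for p in patches:
        (publish if validate(p) else review_queue).append(p)
    return publish, review_queue

# Toy stand-in for a "gold standard" validation: here just a string
# check that the patched code calls the (hypothetical) secure API.
def toy_validate(p: Patch) -> bool:
    return "SecureIntent" in p.diff

published, queued = triage(
    [Patch("A.kt:10", "use SecureIntent()"), Patch("B.kt:7", "raw Intent()")],
    toy_validate,
)
```

In a real pipeline the validator would be the expensive part (build, test, runtime checks); the value of the split is that only validated diffs ever reach product developers.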

Pascal: Yes, that's very nicely put, and this is certainly a problem that still applies today: AI makes mistakes, or won't figure out how to solve a certain call site. In your specific example, you mentioned that there will be certain places where you either don't publish the diff that is under construction, or might just abandon it.

What's your plan for all of this? Is it basically that 1%? Will you just take it on and try to solve it within the product security group?

Alex: That's a very good question, and there are a couple of different ways to look at it. The first one is: how done do you have to be before you move on to solving the next problem? You could imagine that there's always going to be a certain number of call sites in the code base where we don't have perfect security controls applied.

We have reasonable security controls applied, but maybe not perfect ones. So how much effort is valuable to put in to solve some of these long-tail things that are hard for the AI and hard for humans? One thing that we're investing in is attack surface identification, where we would say, okay, there are certain entry points to the app where we know that attackers can

potentially take control of things if they can get a malicious payload in there, or if they can cause us to send a payload from one app to another, or something like that. These are relatively well defined, especially in the mobile space. So what we're trying to do is both get better at AI code mods, and code mods in general, so that we're more successful, but we're also trying to target our efforts better.

Actually, we're using AI to try to generate some of that targeting data as well, which I'm sure could be a whole different conversation. But I think the answer is that in a lot of cases we would try to prioritize those call sites where we might be having failures, based off of some other external metrics, to decide what we want to do in that particular case.
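The prioritization Alex sketches, handling failed call sites reachable from known attack-surface entry points first, might be expressed like this. The entry-point names and the `prioritize` helper are hypothetical examples, loosely modeled on exported Android components that accept external Intents:

```python
# Hypothetical set of externally reachable entry points, e.g. exported
# Android components that accept Intents from other apps.
ATTACK_SURFACE = {"ShareReceiverActivity", "DeepLinkRouter"}

def prioritize(failed_call_sites: list) -> list:
    """Order call sites the codemod failed on so that those reachable
    from a known attack-surface entry point come first (sorted is
    stable, so relative order within each group is preserved)."""
    return sorted(failed_call_sites,
                  key=lambda cs: cs["entry_point"] not in ATTACK_SURFACE)

sites = [
    {"file": "Settings.kt", "entry_point": "InternalOnly"},
    {"file": "Share.kt", "entry_point": "ShareReceiverActivity"},
]
ordered = prioritize(sites)
```

The interesting engineering problem is producing the reachability data itself, which is what the AI-generated targeting data Alex mentions would feed into.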

Pascal: Yeah. Zooming out a little from this specific example of the Intent system: how do you figure out what the AI is currently good at, across all the attack surface areas that exist out there in mobile apps or elsewhere? How do you figure out that something is a good project to pursue now, because the tools are now suddenly good enough to actually deal with a gnarly problem that previously we couldn't address?

Tanu: So basically, AI is evolving at a very, very fast pace. What might be true today might not be true tomorrow; the capabilities are increasing every day. If we don't try, then we don't know whether something can work or not, and that's the strategy we are taking. One thing that I think is a good approach is having some kind of sprint system where you take a quick project, apply AI to fix some of the issues, and see whether it works for small problems or not. Then you decide whether you can scale, using gen AI to adopt it at scale or to create the whole system working in a process-driven manner. So in my opinion, you have to try it out first, and then you'll know.

Alex: A hundred percent, yeah, I'd agree with that. The analogy I've heard internally is letting a thousand flowers bloom: just trying a bunch of different things, seeing what works, and going forward with the ones that do. The great thing about AI is that it really lowers the barrier

to entry for trying things, because instead of having to figure out all of the boilerplate of all of the systems that you have to glue together to do something at scale, to identify the call sites, to do the migrations, to do the validations and everything like that, sometimes it can be as simple as a

two- or three-sentence prompt that gets you 60% of the way to showing that you can make some impact with this. And the wonderful thing about AI is that if you do it correctly, you can still capture that 60% of the impact locally, publish that as a set of diffs, and say, well, I spent another two or three hours, or two or three weeks, on this and didn't make much more progress. Cool, you still made a lot of impact by even trying, because you had some concrete output. So that's the best part about AI: it's not perfect, it's not a hundred percent accurate or anything like that, but it's many, many times faster for iteration speed on these small ideas, and for figuring out which ones need to turn into larger ideas.

Like Tanu said, we've tried to combine this with a sprint system of saying, okay, not only do we have the tools to experiment, we actually have leadership support for running some of these smaller-scale experiments and doing them in a little more structured way. But all of that is ultimately enabled by the low-cost, rapid iteration that you get with these AI agents and AI tools in general.

Tanu: Also, I would add a little bit to that. We talked about creating these diffs and sending them to developers in a hundred percent correct, perfectly building, working state. But what we also experienced was that the process of not giving anything to developers was very, very slow.

If we had to code mod a thousand call sites without anything, that would have taken us years. For our use case specifically, it worked fine that we had perfectly validated generated code. But in cases where it's not perfect, I guess it's still useful. Even if the generated code only works 90%, and developers have to take it and make some modifications, that would still improve efficiency and save so much time, even for developers. Even in our experiment, of those thousand diffs which we created, not all of them were one-click pushes. In some of the cases, I guess it was a 60/40 split,

where 60% were a hundred percent correct: developers just clicked a button and it was pushed to production. In the other cases they had to make certain tweaks, and then either amended the code mod, or validated it explicitly, or did X, Y, Z things on top of it. So for efficiency's sake, we don't have to be a hundred percent correct. Even if AI generates only some percentage of the code, it's still better than having nothing.

Pascal: Yeah, having a starting point is still better than just a lint that sits there, where you need to hover over it and it tells you, hey, you could do better here. Having a starting diff that doesn't quite compile is still so much easier to jump off of.

Alex: Absolutely, yeah. And that speaks to other things as well, right? AI can solve for part of this. It can solve for generating that initial diff, but it still requires a lot of human involvement, and there's a lot of value for human engineers in an AI-driven process: figuring out how to prioritize things, communicating with developers,

and managing the process and everything like that that we're trying to solve for with all of this. So even if AI is doing a lot of the code generation work, I think there is still a strong future for software engineers as part of the process. Maybe not as people who write code a hundred percent of the time, but who still supervise the overall process and make sure that

the big picture is getting accomplished. It gives us an opportunity to focus on higher-level tasks and objectives, as opposed to having to write all of this code and shepherd it through to landing in a lot of cases.

Pascal: Absolutely. And just to echo something both of you said to some degree: I also feel like the whole vibe, for lack of a better term, in the company has really shifted, because there is so much more prototyping and experimentation that you see everywhere, because code has kind of become cheap these days.

You can just try something on the side, and this can be something like a code mod: ah, I've been wanting to use this more accessible component on this kind of surface, but I also had like 500 cases and never wanted to go through them manually. Now maybe you just start a little job on the side and see what it does, and even if it only does 60%, my God, now 60% are more accessible than they were before.

That is massive. Or you just build a new tool on the side. So there is something really exciting going on for us software engineers at this point, and that is something to look forward to. Now, just before we wrap things up: is there something concrete that you both are looking into right now with these newfound skills, that you're investigating or actively working on?

Alex: Yeah, absolutely. From my standpoint, one of the things that I've continued to work on is trying to figure out what code modernization, as a topic area, would look like from a security standpoint. So, trying to apply some of the learnings that we've had here and elsewhere, and building out a reusable pipeline where we can plug in custom validations, custom prompts, and everything like that,

and allow developers to focus on the things that they care about, which is identifying the actual issues that they want to solve. Then we'll figure out the best way to do this and publish it at scale. For me personally, that's been a lot of C and C++ code that we've been looking at recently, but I know there's a ton of other efforts that I believe Tanu and others are involved in that are a little more adjacent to what we're talking about in this podcast.

Tanu: Yeah. Some of the projects align with what Alex said, basically automating the whole pipeline. But specifically, because we are talking about frameworks, one of the North Star goals that we have is evaluating how AI performs for the automation of the whole framework system: whether AI can write frameworks, whether AI can automatically prevent issues from bleeding into the code base for an existing framework,

and whether AI can help in building a pipeline which makes it easy for people to adopt these frameworks at thousands of call sites. Can we automate the whole system somehow? So we are now in the process of working on the pieces of this whole pipeline and learning more about the capabilities of gen AI: which things it can help us automate, versus which specific pieces might need human involvement.

We are not there yet. We are still evaluating the whole pipeline and the automation, and that's the North Star goal for us for this year.

Alex: I just want to double-tap on one thing that Tanu said there, which was thinking about AI in the context of frameworks, and whether or not the AI can do the development of the framework and the deployment of the framework and everything like that. One thing that we're also looking at on our side is whether or not,

in an AI world, we would develop as many frameworks as we have in the past. You could imagine that frameworks, or at least libraries (I think there is a difference, but the terminology gets used a little interchangeably inside the company), are generally made because a piece of code, a library, makes it easier for humans to reason about the functionality at a particular call site, and it makes it easier to do the actual code modification and everything like that. But frameworks and libraries can tend to be fairly rigid. What we've found is that, over time, a library that started with three functions to cover five different features suddenly explodes into 30 functions to cover 300 different features, or something like that. So if we have AI that's already able to reason about adopting that framework, well, why can't it just write some slightly more bespoke code for some of these call sites, instead of us having to build a framework that we then have to maintain and migrate and everything like that, too?

Pascal: I am certainly looking forward to having more secure-by-default code, because one of the most nerve-wracking experiences is when somebody from the ProdSec on-call comments on your diff, and that has happened to me a few times. But we are sadly out of time. I can only thank you both for ensuring that we, and our users worldwide, stay safe and secure while using our apps.

And of course, for joining me here on the Meta Tech Podcast.

Alex: Absolutely. It's been a lot of fun. Appreciate the opportunity.
