Watch an interesting case study of how AppsFlyer moved from Bitbucket to GitLab
Transcription:
Hello, my name is Elad Leev.
I’m a platform engineer at AppsFlyer, and I’ve been there for the last two and a half years.
I love distributed systems and databases, and I’ve been working with them for a while.
I like to call myself an RSS junkie because I always start my morning by reading my RSS feed.
In today’s talk, we are going to see some of the unique challenges that AppsFlyer is facing.
We are going to see what the motivation was for making this kind of move, what was involved in the migration process itself, and what architecture was chosen.
In the end, we are going to see a small retrospective on two years of using GitLab, because this migration actually happened two years ago.
About AppsFlyer
So what is AppsFlyer?
It’s a mobile attribution and analytics platform. We basically help app marketers make better decisions about their running campaigns, using the data that we measure for them. If you have a mobile device, you probably have the AppsFlyer SDK installed by one of our clients.
We are currently installed on 90% of the devices in the world, and we are trusted by the world’s best companies like Slack, HBO, Alibaba, and Walmart.
This graph can represent our incoming traffic growth, but it can also represent the number of engineers that we hire, the number of microservices that we have, and so on.
We are facing more than 1 million incoming HTTP requests a second, which sum up to 90 billion events per day. We have a crazy amount of projects and microservices in our system, and we have more than 300 developers in the R&D organization.
So, what did we have before GitLab?
For historical reasons, we didn’t start with GitLab or GitHub. We actually started with Bitbucket, and we used the hosted solution. Our founders decided to go with Mercurial because they thought it was easier to move fast with it: the command line that Mercurial provides seemed much simpler than the Git one.
So why did we decide to move?
First, we didn’t want to stay on a hosted solution. As our company grew, the demands of our clients grew as well, and we couldn’t put ourselves at risk of opening one of our repositories to the outside world. It happened to us more than once with minor repositories, and it was just too easy to do on a hosted solution, so we wanted a solution that would sit inside our VPC.
Latency
From time to time we faced latency from the Bitbucket service. Some of it did happen because of our network configuration, but it started to cause our builds to fail, and that’s something that we cannot live with.
Limitation
Bitbucket limits you to 1,000 calls per hour, and it was really easy to exceed that.
So what did we try?
We did try to use a self-hosted Bitbucket solution, but it was closed source, so you couldn’t really know what was going on inside of it. If you faced a bug, you had no way to know whether it happened because of your configuration or because something was wrong with the product itself. It just felt like a black box.
We considered using GitHub Enterprise because basically everybody knows how to use GitHub, but it was really, really costly and it gave us minimal ROI.
So why GitLab?
We decided to go with GitLab for a few reasons. The first one is GitLab’s growth. GitLab has shown great growth and maturity over the years.
In fact, it became the most DevOps-friendly product out there and got adopted by more and more companies. We really appreciate the transparency approach that GitLab is taking: everything in GitLab is basically public by default. You can see the issue tracker, you can see the code itself, and much more. I highly recommend reading the PagerDuty onboarding notebook that GitLab has; it’s educational, and you can also learn a lot about the GitLab organization itself.
To sum it up, GitLab was the best fit for us.
The migration process
During the migration process, we asked ourselves some key questions. First, API support. A lot of things happen during the builds of our services, and we wanted to make sure that the new GitLab API really contained all the endpoints that we need to make those decisions.
Architecture. We are a growing startup in a hyper-growth situation. How do we create an architecture that will scale up ten times from today?
Education. Can you change your developers’ main tool?
Tooling and integration. Do we have sufficient tooling to make this move seamless? So let’s dive in.
During the migration, we had a few things in mind. We had to preserve commit history and tags, and we needed to do it in the fastest and most efficient way possible.
After a short research, we found a tool called ‘fast-export’; you can see the URL down there if you want to use it. This tool basically takes a Mercurial repository and converts it into a Git one. We started with a few teams’ repositories and saw that it worked well, but then how do we scale it? How do we move all of our repositories to GitLab?
We decided to create a tool that would help us make this move by giving our R&D organization a self-service option. The tool was a one-liner, so it was easy to use. It was idiot-proof, so no one could make a mistake. It kept everyone in sync using a designated Slack channel, as you can see in the picture. And lastly, it was really safe: you could not override someone else’s repository by mistake.
The basic flow of the tool was like this:
First, we check whether the repository already exists in GitLab. Then we notify the right team using a Slack channel, we close the old repository to writes in Bitbucket, and we create the repository in GitLab.
We convert the repository from Mercurial to Git and then push it to GitLab, and at the end we log it to the channel that I showed before.
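The flow just described could be sketched roughly as follows. This is only an illustration, not the actual AppsFlyer tool: the `MigrationSteps` class and every callable in it are hypothetical placeholders for the real Bitbucket, GitLab, Slack, and conversion calls, which the talk does not show.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MigrationSteps:
    """Hypothetical hooks; each would wrap a real Bitbucket, GitLab, or Slack call."""
    exists_in_gitlab: Callable[[str], bool]
    notify_team: Callable[[str], None]
    lock_bitbucket_repo: Callable[[str], None]
    create_gitlab_project: Callable[[str], None]
    convert_hg_to_git: Callable[[str], str]  # returns the path of the converted Git repo
    push_to_gitlab: Callable[[str, str], None]
    log_to_channel: Callable[[str], None]


def migrate(repo: str, steps: MigrationSteps) -> bool:
    """Run the migration flow for one repository; return False if it was skipped."""
    # Safety check first: never overwrite a repository that was already migrated.
    if steps.exists_in_gitlab(repo):
        steps.log_to_channel(f"{repo}: already exists in GitLab, skipping")
        return False

    steps.notify_team(repo)
    # Close the old repository to writes before converting, so nobody keeps
    # pushing to Bitbucket after the move.
    steps.lock_bitbucket_repo(repo)
    steps.create_gitlab_project(repo)
    git_path = steps.convert_hg_to_git(repo)
    steps.push_to_gitlab(repo, git_path)
    steps.log_to_channel(f"{repo}: migrated to GitLab")
    return True
```

Passing the steps in as callables keeps the ordering logic in one place and makes it easy to dry-run the flow with fake steps before pointing it at real repositories.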
I want to focus on one thing here: it’s really, really important to close the old repository to writes in the Bitbucket service. It happened to us more than once that a developer used this tool to migrate his repository from Bitbucket to GitLab, but another developer didn’t know that the repository had been moved. You don’t want to get into this split-brain scenario, so I suggest you always close the old repository in Bitbucket so everyone will push to the right place.
During the migration, we got some additional benefits from it. First, we had the ability to do a small cleanup. We found a lot of dead services and a lot of repositories that no one was using, so we had a chance to download them and just upload them to S3 to store them for the future.
And second, because basically everyone took part in the process, we built our developers’ trust in the platform group.
To sum this part up let’s see what helped us:
First, we created a self-serve tool. We couldn’t do it alone, and we did it with the help of our R&D organization. Second, transparency: everyone saw the process as it went. We built our developers’ trust, and we tried to make sure that everyone was aware of the benefits of moving to GitLab. And in the end, we set deadlines. I know it’s not the nicest thing to do, but we had to put a deadline on the migration.
Education
As I said earlier, a big part of the migration was education. We tried to make sure that everything was covered and well documented. At AppsFlyer we use Guru to document internal things in the organization, and as you can see we created several documents. For example, we created a document for issues that could come up in Jenkins from the migration, we created one with basic configuration, and so on.
AppsFlyer gives all the developers free access to Pluralsight, so we found two good courses and sent them to our developers in order to strengthen their knowledge. Another resource that I highly recommend is this one, which can be used as a cheat sheet (pun intended). It’s just a website that has a lot of edge cases that could happen when using the Git command line, and how to solve them.
Now we are moving to the interesting part. We asked ourselves: how many application servers do we need? Should we use EFS or should we use NFS? How can we make it highly available? What is the best solution for DR? And lastly, how do we back up GitLab?
This is the architecture that we came out with:
As you can see, a developer basically reaches a Route 53 address which has a Consul service behind it, and this Consul service is built from GitLab instances. We use the managed Elasticsearch service for the fuzzy search, we use RDS as our database, and we use our own Redis with Redis Sentinel for caching. I want to focus on one thing here, which is EFS. At first, we thought that we could save our souls from using the old, rusty NFS service and just use EFS. We thought that EFS had come a long way since AWS launched the service, and we thought that it would fit us.
Apparently we were wrong, because EFS does not cope well with tiny files like the ones Git creates. At first it worked really well, but after a major part of our developers moved to GitLab we saw things start to break, and as always it happened on a Friday night. So eventually we did use NFS, with NFS replication. I want to focus on one more cool thing that we did: we have a daily, actually hourly, backup to S3 with replication, and we also have a daily restore to Google Cloud Platform. This both allows us to actually test our backups and gives us a separate GitLab instance waiting on another cloud provider, which really helps us to be cloud-agnostic.
You all know the phrase: if a tree falls in a forest and no one is around to hear it, does it make a sound? My version of it is: if a backup is taken and no one has ever tested it, does it count as a backup? So always test your backups.
Self-service tooling. Like I said before, the first tool that we created was the migration tool, but we also created a lot of other tools in order to support this migration. We created a small one-liner that we actually built based on a request from one of our teams. They wanted the ability to connect their repository with Slack, so we created a one-liner that makes it really simple. And as I said, we set deadlines for the migration. In order to keep things on track, we created a scheduled task that basically took all the projects from Bitbucket and all the projects from GitLab and checked which repositories had migrated to GitLab. After the process finished and we had the list of repositories that hadn’t moved to GitLab yet, we posted it on the relevant team channel, as you can see. For example: “Hey data team, you didn’t move those services and you need to do it as fast as possible.” So basically we created some kind of shame list.
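The comparison behind that scheduled task could look something like the sketch below. It is an assumption-laden illustration rather than the real job: the input shapes (a mapping of Bitbucket repository name to owning team, and a list of GitLab project names) and both function names are hypothetical, and fetching the lists and posting to Slack are left out.

```python
from typing import Dict, Iterable, List


def pending_by_team(
    bitbucket_repos: Dict[str, str],
    gitlab_repos: Iterable[str],
) -> Dict[str, List[str]]:
    """Group repositories that exist only in Bitbucket by their owning team.

    bitbucket_repos maps repository name -> team name (hypothetical shape).
    """
    # Compare names case-insensitively, assuming names are kept during migration.
    migrated = {name.lower() for name in gitlab_repos}
    pending: Dict[str, List[str]] = {}
    for name, team in sorted(bitbucket_repos.items()):
        if name.lower() not in migrated:
            pending.setdefault(team, []).append(name)
    return pending


def reminder(team: str, repos: List[str]) -> str:
    """Format the message that would be posted to the team's channel."""
    return (
        f"Hey {team} team, these repositories still need to move to GitLab: "
        + ", ".join(repos)
    )
```

A scheduler would call `pending_by_team` on each run and send one `reminder` per team that still has entries.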
Another cool thing that we did was to create an in-house GitLab API wrapper. During the migration, we saw that we had a lot of code duplication in our services. In the AppsFlyer platform group we write code in Ruby, in Python, in Go, in Bash, and so on, and we saw a lot of these duplications. A lot of people just implemented their own way to, for example, authenticate with Vault, handle the GitLab API paging, filter the payload, and so on. So we decided to create a central place that would contain all the related metadata that we need.
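As an example of the kind of logic that ends up duplicated, here is a minimal sketch of paging through a GitLab list endpoint. It relies on GitLab's offset pagination (`page` and `per_page` query parameters, and the `X-Next-Page` response header). The `fetch` callable and its signature are hypothetical; it stands in for whatever HTTP client performs the request, and the sketch assumes headers arrive in a plain dict with that exact key casing.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# fetch(path, params) -> (items, response headers). Hypothetical signature that
# a real HTTP client call would be wrapped in.
Fetch = Callable[[str, Dict[str, str]], Tuple[List[dict], Dict[str, str]]]


def paginate(fetch: Fetch, path: str, per_page: int = 100) -> Iterator[dict]:
    """Yield every item from a paginated endpoint by following X-Next-Page."""
    page = "1"
    while page:
        items, headers = fetch(path, {"page": page, "per_page": str(per_page)})
        yield from items
        # The header is empty on the last page, which ends the loop.
        page = headers.get("X-Next-Page", "")
```

Writing this once in a shared service or library means individual builds no longer need their own copy of the loop.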
We decided to create a wrapper, and not that kind of wrapper, but this kind of wrapper: the central place that contains all the GitLab metadata. As you can see, all our services and builds talk to this service. It’s updated using GitLab system hooks, which is a really cool component of GitLab, and we use Redis as a cache layer. We also have an internal scheduler that checks data integrity. You can see all the system hooks that GitLab can post in the documentation. You can do crazy stuff with them, and it’s really easy to debug them using the UI itself, which is really amazing. If you want to read the full technical write-up about what we did and how we did it, you can visit our blog post.
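To make the idea concrete, here is a small sketch of a cache that is kept up to date from system hook payloads. It is not the AppsFlyer service: the `ProjectCache` class is hypothetical, a plain dict stands in for Redis, and only a couple of payload fields (`event_name`, `project_id`, `name`, `path_with_namespace`) from GitLab's project events are used.

```python
from typing import Dict, Optional

# Project events that should create or refresh a cache entry.
UPSERT_EVENTS = ("project_create", "project_update", "project_rename", "project_transfer")


class ProjectCache:
    """In-memory stand-in for the Redis cache layer, keyed by project id."""

    def __init__(self) -> None:
        self._projects: Dict[int, dict] = {}

    def handle_system_hook(self, payload: dict) -> None:
        """Apply one system hook event to the cache; ignore unrelated events."""
        event = payload.get("event_name")
        project_id = payload.get("project_id")
        if project_id is None:
            return  # events without a project id, e.g. user_create
        if event == "project_destroy":
            self._projects.pop(project_id, None)
        elif event in UPSERT_EVENTS:
            self._projects[project_id] = {
                "name": payload.get("name"),
                "path_with_namespace": payload.get("path_with_namespace"),
            }

    def get(self, project_id: int) -> Optional[dict]:
        """Return cached metadata for a project, or None if it is unknown."""
        return self._projects.get(project_id)
```

Because hooks can be missed, a periodic job that re-reads the API and reconciles the cache, like the internal scheduler mentioned above, is a sensible complement to this event-driven update.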
And lastly, we have the ability to see whether everything that we did really was good for our growth. Today at AppsFlyer we use GitLab for many, many things. A lot of teams actually ditched other products like JIRA and started to manage their tasks inside GitLab, which is really amazing.
But we did face two bugs that I want to show you today.
The first one is this bug, which actually still exists; it’s still open. We hit it because of the restores that we are doing to GCP. We found out that while we are making those restores, GitLab just takes the repositories directory, moves it aside, and calls it repositories.old.<timestamp>. As you can guess, after a certain time the disk runs out of space. We solved it by creating a small script that cleans these directories up, but the issue itself is still open.
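A cleanup script of that kind could be as small as the sketch below. It is an illustration under assumptions, not the script from the talk: it assumes the leftover directories sit in one base directory and are named `repositories.old.<unix timestamp>`, and the function name and age threshold are made up for the example.

```python
import re
import shutil
import time
from pathlib import Path
from typing import List, Optional

# Assumed naming: "repositories.old.<unix timestamp>".
OLD_DIR = re.compile(r"^repositories\.old\.(\d+)$")


def clean_old_repositories(
    base: Path, max_age_seconds: int, now: Optional[float] = None
) -> List[str]:
    """Delete leftover restore directories older than max_age_seconds.

    Returns the names of the directories that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in sorted(base.iterdir()):
        match = OLD_DIR.match(entry.name)
        if not (match and entry.is_dir()):
            continue  # never touch the live repositories directory
        if now - int(match.group(1)) > max_age_seconds:
            shutil.rmtree(entry)
            removed.append(entry.name)
    return removed
```

Matching the full directory name with an anchored pattern is deliberate: a looser match could delete the live repositories directory.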
Another one is this one, which has already been fixed by GitLab. We found out that while a backup was being taken, some of the developers continued to push code to the repository that was being backed up, so we faced a lot of errors like “file changed as we read it”. GitLab solved it by introducing a backup strategy option, so we just used the copy strategy and it solved the problem.
And that’s it. Thank you all, and you can find me here if you have any questions. Thank you.
Related Links:
- Watch: the fully detailed story was first published in our meetup and blog
- How we can help you migrate from Bitbucket to GitLab
- Bitbucket website