Something Digital - magento

MySQL Best Practices for Upgrading to Magento 2.3

If you’re a member of the Magento community, you’ve probably heard that Magento has declared end of life for 2.1.x support to be June 2019 (that’s now!). As Magento 2.1 approaches end of life, we at SD have started the process of upgrading a number of our clients from 2.1 to 2.3, the latest major version of Magento 2.

Because of the number of core data changes that come with the upgrade from 2.1 to 2.2 alone (e.g. the move from serialized strings to JSON), we ran into a few issues during the initial process of deploying a 2.3 instance to a staging environment. The most challenging issue came in the form of a MySQL error during the setup:upgrade process. If you’re unfamiliar with this particular Magento command, it’s used to apply the database changes that the application code requires. Essentially, modules can make changes to preexisting database tables, or even create new tables for their own use. Because of the conversion of serialized data to JSON in many of Magento’s core tables, including customer and sales data, this process can take a very, very long time to execute, depending on the amount of data in your database.

Just to take a step back for a second, I was recently fortunate enough to have the opportunity to attend an online Oracle course on MySQL Performance Tuning. At the very core of understanding how to tune your MySQL instance’s performance lies the requirement that you establish a consistent baseline to work from. Along with setting this baseline comes the responsibility of making sure that test environments have a similar amount of data and load to accurately test any updates. Clearly for us, it is of the utmost importance to have a near-identical replica of production data in a staging environment in order to test (and benchmark) the process of converting all of this data during these upgrades. For a number of our clients, this means making sure that we have enough anonymized customer and sales data to give us an accurate representation of how long these deployments will take, or whether we’ll run into issues in a production environment.

And run into issues we did! Aside from the usual errors one would expect, caused by corrupted serialized data or incompatible 3rd party modules, we ran into a rather bizarre (read: misleading) MySQL error during that setup:upgrade process on one of our Community Edition clients with a large amount of sales data. See the error below:

Warning: Error while sending QUERY packet. PID=14322 in /vendor/magento/zendframework1/library/Zend/Db/Statement/Pdo.php on line 228

And that’s all, folks! No other exceptions, errors, warnings, clues. Nada. Digging into the code a little further, we found that Magento was attempting to query the database, but when the MySQL instance unexpectedly threw an error during execution of the query, the application code wasn’t handling it. So all we were left with was this somewhat vague warning from MySQL.

If you’re unfamiliar with this error, a quick Google search will point you in the direction of increasing the max_allowed_packet variable on your MySQL instance. Which is what we did… again and again, until we maxed out the variable at its 1G limit. So, what gives?
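For reference, checking and raising this variable is straightforward; the value below is the documented 1G maximum (note that SET GLOBAL only affects new connections, and the change should also be persisted in my.cnf to survive a restart):

```sql
-- Check the current value (in bytes)
SHOW VARIABLES LIKE 'max_allowed_packet';

-- Raise it at runtime; 1073741824 bytes = 1G, the documented maximum
SET GLOBAL max_allowed_packet = 1073741824;
```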

As it turns out, this particular staging environment’s MySQL instance had been heavily modified. Many variables were not only changed from their defaults, but drastically different from what was configured on production. Which might make sense, right? Production experiences heavier loads than staging, duh. But if we go back to proper performance tuning and the goals we mentioned above, we should have remembered to replicate not just the data in staging, but the MySQL configuration itself.

One system variable in particular had been significantly decreased in staging: wait_timeout. As defined in the official MySQL documentation, this variable is “the number of seconds the server waits for activity on a non-interactive connection before closing it.” We increased this variable as necessary, re-ran the deployment, and this time everything worked: the deployment, specifically setup:upgrade, was successful. So again, what gives?
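Inspecting and adjusting wait_timeout works the same way as max_allowed_packet (28800 seconds, i.e. 8 hours, is MySQL’s default; the right value for your environment may differ):

```sql
-- See what the server is currently configured with
SHOW VARIABLES LIKE 'wait_timeout';

-- Restore it to the MySQL default of 8 hours for new connections;
-- persist the same value under [mysqld] in my.cnf as well
SET GLOBAL wait_timeout = 28800;
```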

We came to understand that one of the queries – the query referenced in the error above – was actually timing out: its connection sat idle, waiting on another very large query to finish, long enough for the server to close it. We were able to replicate this behavior in a local environment by reducing the wait_timeout variable, running two competing queries at the same time, and constructing one of them to be very large and time-consuming.
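A rough sketch of that local reproduction, consistent with the wait_timeout definition quoted above (the statements and timings here are illustrative, not the exact queries from the deployment):

```sql
-- Session 1 (connection A): set an artificially low timeout, run a
-- statement, then go idle
SET SESSION wait_timeout = 5;
SELECT 1;

-- Session 2 (connection B): meanwhile, run a long, expensive statement
-- standing in for the "very large query", e.g.
--   SELECT SLEEP(60);

-- Session 1: once connection A has been idle longer than wait_timeout,
-- the server closes it; the next statement sent on it fails client-side
-- with a lost-connection error, surfacing in PHP as the
-- "Error while sending QUERY packet" warning
SELECT 1;
```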

There is no way we would have caught this issue prior to a production deployment had we not made an attempt at replicating the live MySQL instance and its data in a test environment. Overall, I think our team learned a valuable lesson about both the scale of upgrading from 2.1 to 2.3 and the importance of data replication in a test environment.

What I Learned Becoming a New Relic Certified Performance Pro

I recently became a New Relic Certified Performance Pro, and if you are not familiar with New Relic just yet, I would highly encourage you to check it out. When it comes to troubleshooting issues with your application, or even just looking for ways to tune up its performance, New Relic can be an invaluable resource to have. There are several types of monitoring that New Relic offers, but for the purposes of this piece, and given the focus of the certification exam, we will concentrate on New Relic APM (more on this later).

The certification itself is relatively straightforward. New Relic suggests preparing for it by running through one of their free online courses. There is also an additional webinar with the sole purpose of preparing you for the certification exam; it is quite useful for getting a sense of the questions you will come across. To pass, you will need to score at least 80%, but considering you will have seen a majority of the questions in the course’s practice tests and the webinar, this should not be too difficult for most. Some important aspects of New Relic that you will see on the exam (and will definitely need to understand in order to use New Relic effectively) are: Application Performance Monitoring (APM) at large, Transactions, Error Analytics, Alerts, and Apdex.

First, it is important to understand APM. While New Relic offers all kinds of monitoring, APM is really the core of New Relic. You will need to understand that it monitors the performance of the application itself, not the performance of the hardware the application is running on, nor the direct user experience (though an important argument could and should be made that the performance of your application has a significant impact on user experience).

Transactions are one of my favorite aspects of New Relic. A transaction refers to an event that took place in a specific part of your application. Transactions become especially useful when trying to track down a part of your application that is taking up a lot of time and/or resources. They can cover anything from a request for an external web resource to a slow database query, though exact queries will likely be obfuscated by default. Understanding how to drill into transactions effectively to track down a slow or buggy method in your application is incredibly important.

New Relic’s Error Analytics is another nice feature, especially useful during fires or other critical issues. It combines the powerful granularity of transaction monitoring with the stack traces from reported errors that you might normally have to dig through logs to find. With Error Analytics you can monitor errors of varying levels of criticality and get a sense of their frequency. And because it is integrated with Transactions, as previously mentioned, you can drill down and see what part of your application is responsible.

As with most aspects of New Relic, Alerts can be created with a great deal of granularity and are especially helpful if you are looking to monitor a specific aspect of your application or simply stay up to date with any major outages. Alerts are a big-ticket item for us at SD, as they allow us to hop on a client’s emergency before they even recognize that something is wrong. One particularly useful feature of Alerts is the ability to assign different groups to notify in the event of a disruption or degradation of service. In most cases our clients prefer to be in the loop when an Alert for an outage comes in; however, there may be cases where an Alert is critical for one party and more or less noise for another. Setting up proper notification channels is just one part of the granularity of Alerts. When it comes to taking the certification test, you will want to make sure that you also understand the conditions involved in creating Alerts, as well as how to set up Alerts for what New Relic calls “key transactions.”

Apdex refers to a generated score that your application receives: a numeric value representing the health and/or performance of your application. Essentially, there are three buckets that a request to your application can fall into: Satisfied, Tolerating, and Frustrated. While the exact threshold (the T value) that determines which bucket a request falls into is up to you to decide, there is a very specific formula used to calculate your application’s Apdex score. Understanding how your Apdex score is calculated is critical to determining your T value, as this threshold is quite subjective and varies from application to application. If you are considering taking the certification test, make sure you understand how to calculate an application’s Apdex.
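The formula itself is simple: Satisfied requests (completed within T) count fully, Tolerating requests (between T and 4T) count half, and Frustrated requests (beyond 4T) count not at all. A minimal sketch, with made-up sample counts for illustration:

```python
def apdex(satisfied: int, tolerating: int, frustrated: int) -> float:
    """Apdex = (satisfied + tolerating / 2) / total samples.

    A request is Satisfied if it completes within T seconds,
    Tolerating between T and 4T, and Frustrated beyond 4T.
    """
    total = satisfied + tolerating + frustrated
    if total == 0:
        return 1.0  # no samples in the window; treating as a perfect score is our assumption
    return (satisfied + tolerating / 2) / total

# Example: with T = 0.5s, suppose 600 requests came back Satisfied,
# 300 Tolerating, and 100 Frustrated:
print(apdex(600, 300, 100))  # 0.75
```

Working one example by hand like this makes the exam questions on Apdex much easier to reason through.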

Overall, New Relic is an invaluable resource for developers looking to monitor their application’s health and/or performance. The certification test is well worth your time if you are looking to get a better understanding of how to make the APM tools work for you and get the most out of them. If you are looking for additional resources regarding APM, New Relic’s APM documentation is always a great starting place. And do not forget that there is so much more that New Relic can do for you; APM is just the tip of the performance monitoring iceberg!