Improve mysql performance on large data sets

The world of relational database theory is a world dominated by tables and sets, and operations on tables and sets.A database is a set of tables, and a table is a set of rows and columns. When you issue a SELECT statement to retrieve rows from a table, you get back another set of rows and columns that is, another table and it is called abstract notations.

MYSQL performance


Database management systems implement abstract concepts but do so on real hardware bound by real physical constraints. As a result, queries take time—sometimes an annoyingly long time.And we, being impatient creatures, don’t like to wait, so we leave the abstract world of instantaneous mathematical operations on sets and look for ways to speed up our queries. Fortunately, there are several techniques for doing so:

  • Create indexes on tables to enable the database server to look up rows more quickly.
  • Consider how to write queries to take advantage of those indexes to the fullest extent, and use the EXPLAIN statement to check whether the MySQL server really is doing so.
  • Write queries to affect the server’s scheduling mechanism so that queries arriving from multiple clients cooperate better.
  • Tune the server’s configurable operating parameters to get it to perform more efficiently.
  • Analyze what’s going on with the underlying hardware and how to work around its physical constraints to improve performance.

Indexing : There are various techniques to speed up the query but indexing is the basic mechanism to speed up your query. If you are doing proper index of your column so you can easily see the real change in execution time. It’s often true that when a query runs slowly, adding indexes solves the problem immediately.But it doesn’t always work like that, because optimization isn’t always simple.Nevertheless, if you don’t use indexes, in many cases you’re just wasting your time trying to improve performance.So always first you can using indexing first then other technique will be helpful.


Benefits of Indexing :

Let’s consider how an index works, beginning with a table that has no indexes.An unindexed table is simply an unordered collection of rows. When addition of an index on the coloumn in table the index contain an entry for each row in that table.

lets suppose we have a table name student.

id roll_no class
1  11882   10
2  22332   11
3  34432   12
4  23343   9

select roll_no from student where class = 11;

Now, instead of searching through the table row by row looking for items that match, we can use the index.  Suppose we’re looking for all rows for class 10 We begin scanning the index and find values for that class. Then we reach the index value for class 11, which is higher than the one we’re looking for.  Index values are sorted, so when we read the index row containing 11, we know we won’t find any more matches and can quit looking. so where the matching end skip the rest.That way, we can quickly position to the first matching value and save a lot of time in the search.

The MySQL Query Optimizer : 

The MySQL query optimizer takes advantage of indexes, of course, but it also uses other information. For example, if you issue the following query, MySQL will execute it very quickly, no matter how large the table is:

SELECT * FROM tbl_name WHERE FALSE; 

MySQL looks at the WHERE clause, realizes that no rows can possibly satisfy the query, and doesn’t even bother to search the table.

EXPLAIN SELECT * FROM tbl_name WHERE FALSE

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE NULL NULL NULL NULL NULL NULL NULL NULL NULL Impossible WHERE

Normally, EXPLAIN returns more information than that, including values more informative than NULL about the indexes that will be used to scan tables, the types of joins that will be used, and estimates of the number of rows that will need to be examined from each table.

EXPLAIN is to find out whether adding indexes would help the optimizer execute a statement more efficiently.For this example, I will use just two tables that initially are unindexed.This suffices to show the effect of creating indexes. The same principles apply to more complex joins that involve many tables. Suppose that we have two tables t1 and t2 , each with 1,000 rows containing the values 1 to 1000.The query that we’ll examine looks for those rows where corresponding values from the two tables are the same:


SELECT t1.i1, t2.i2 FROM t1 INNER JOIN t2
-> WHERE t1.i1 = t2.i2;

EXPLAIN SELECT t1.i1, t2.i2 FROM t1 INNER JOIN t2
-> WHERE t1.i1 = t2.i2

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: t2
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using where

Here, ALL in the type column indicates a full table scan that examines all rows. NULL in the possible_keys column indicates that no candidate indexes were found for speeding up the query.(The key , key_len , and ref columns all are NULL as well due to the lack of a suitable index.).Using where indicates that information in the WHERE clause is used to identify qualifying rows.

To make this query more efficient, add an index on one of the joined columns and try the EXPLAIN statement again:

mysql> ALTER TABLE t2 ADD INDEX (i2);
mysql> EXPLAIN SELECT t1.i1, t2.i2 FROM t1 INNER JOIN t2
-> WHERE t1.i1 = t2.i2;

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: t2
type: ref
possible_keys: i2
key: i2
key_len: 5
ref: sampdb.t1.i1
rows: 10
Extra: Using where; Using index


  • This is an improvement.The output for t1 is unchanged (indicating that a full scan still will be done on the table), but the optimizer can process t2 differently:
  • type has changed from ALL to ref , meaning t1 ) can be used to perform an index lookup that a reference value (the value from to locate qualifying rows in t2 .
  • The reference value is given in the ref field: sampdb.t1.i1 .


Choosing Right Data Types for Efficient Queries:

Use numeric rather than string operations. Calculations involving numbers generally are faster than those involving strings.

If a string column has a limited number of values, you can use an ENUM or SET type to get the advantages of numeric operations.These types are represented internally as numbers and can be processed more efficiently

choose the smallest type that will hold the required range of values. Don’t use BIGINT if MEDIUMINT will do. Don’t use DOUBLE if you need only FLOAT precision. If you are using fixed-length CHAR columns, don’t make them unnecessarily long. If the longest value you store in a column is 40 characters long, don’t define it as CHAR(255) ; define it as CHAR(40) .

For MyISAM tables, use fixed-length columns rather than variable length columns. For example, make all character columns CHAR rather than VARCHAR .The tradeoff is that your table will use more space, but if you can afford the extra space, fixed length rows can be processed more quickly than variable-length rows.

These are some technique to speed up your mysql performance . There are many other ways to speed up the mysql performance but these above points are basics and must be used . I hope you like the post please comment and share.





No comments:

Post a Comment

Download the eBook of react js for beginners

ReactJS basically is an open-source JavaScript library which is used for building user interfaces specifically for single page applications...

Popular