With a comprehensive understanding of core database performance issues, my optimization techniques maximize the speed and efficiency of instances and SQL queries on databases such as Oracle, MySQL, and PostgreSQL.
With a deep understanding of Capacity Management challenges, I estimate object growth and storage usage with high accuracy, which gives management better visibility for capacity planning and also helps in saving costs.
Defining guidelines for organizations and database administrators helps them devise a backup strategy specific to their environment. When creating a backup strategy, my objective is to look beyond simply having a “valid copy of the backup”.
Continuous evaluation of the database environment is the key for me to avoiding security lapses introduced over the course of the development lifecycle. Attaching equal importance to identifying database vulnerabilities makes my solutions more secure.
A good data model not only adds flexibility for developers but also boosts database performance. Nearly two decades of database expertise make it easy for me to prepare a best-suited, purpose-built data model for situations such as multilingual data, data in terabytes, and monolithic or microservice architectures.
Associate Principal Engineer
Performance Tuning | Database Architect | DBA
My expertise covers the areas of Performance Tuning, Backup Strategy & Database Security on Oracle, PostgreSQL, and MySQL databases.
It also extends to Capacity Planning, PL/SQL application development, data migrations and conversions, and database migration, including Time Series & NoSQL databases.
With a deep understanding of databases, I have on many occasions been able to quickly convert database-related customer escalations into appreciation. I have worked extensively on statistics & histograms, SQL plan stability, SQL trace files, Profiler, database instance optimization, query optimization, database modeling, pg_stat_statements, pgaudit, PostGIS, and managing tablespaces & storage. I have also successfully orchestrated the migration of a multi-terabyte database into Production.
I am currently associated with Nagarro Software Pvt Ltd
Nagarro Software Pvt Ltd, Gurgaon
Polaris Software Lab, Gurgaon
Vayam Technology, NOIDA
Espire Infolabs Pvt Ltd, Gurgaon
Krishna Maruti Ltd, Gurgaon
There are multiple ways to install the PostgreSQL database on Unix-like platforms. We will explore the installation of PostgreSQL version 11 on Ubuntu 18.04 LTS using apt. Let’s start. Log in to Ubuntu and check the Ubuntu version:
postgres@sanjeeva:/home/sanjeeva/postgres$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: Read more about PostgreSQL Installation on Ubuntu[…]
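As a minimal sketch of the apt-based route, these are the usual PGDG repository steps from apt.postgresql.org (verify the repository line and key URL against the official docs for your release):

# add the PostgreSQL Global Development Group (PGDG) repository for Ubuntu 18.04 (bionic)
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt bionic-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
# import the repository signing key
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
# refresh package lists and install PostgreSQL 11
sudo apt-get update
sudo apt-get install -y postgresql-11
# confirm the new cluster was created and is running
pg_lsclusters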
Research papers on IBM’s “System R” were initially picked up by two professors, Michael Stonebraker and Eugene Wong, at the University of California, Berkeley. This resulted in a new database called the INteractive Graphics REtrieval System, i.e. “Ingres“. The work done by this duo on Ingres became the foundation of many relational databases like MS SQL Server, Read more about Evolvement of PostgreSQL: Background[…]
Indexes are a very important structure in any database, and the same is true of MongoDB. They boost the performance of your queries. It is also very important to know what indexes exist and how many are created on your DB. In the normal life of a DBA, one needs to find out Read more about How To: List all indexes in MongoDB[…]
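As a quick sketch, one way to list every index in a database from the mongo shell; the database name mydb is just a placeholder:

mongo mydb --quiet --eval '
  db.getCollectionNames().forEach(function (c) {
    print("collection: " + c);                   // name of the collection
    printjson(db.getCollection(c).getIndexes()); // its index definitions
  });
'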
Many a time it is required to create a new collection from an existing one, with the same or a different size. In this blog, we will see how to create a new, smaller collection from an existing one. For sample data, we may use Kaggle to get a huge data set. In my collection, I Read more about How To: Creating a collection subset from collection in MongoDB[…]
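A minimal sketch using the aggregation pipeline, where mydb, source_coll, and source_subset are placeholder names; $limit caps how many documents are copied and $out materializes them into a new collection:

mongo mydb --quiet --eval '
  db.source_coll.aggregate([
    { $limit: 100000 },        // keep only the first 100k documents
    { $out: "source_subset" }  // write them into a brand-new collection
  ]);
'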
This is the second part of my installation blog on InfluxDB. In the first part of this blog we discussed single-instance installation. In this part we are going to discuss cluster installation for InfluxDB. For a cluster installation on a system that is not production-ready, InfluxDB provides a special cluster installation package which they call Read more about InfluxDB: How to perform Quick cluster installation on CentOS[…]
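For orientation, a hedged sketch of how nodes are typically joined in an InfluxDB Enterprise cluster with the influxd-ctl tool, assuming the meta and data packages are already installed; the hostnames meta1, meta2, meta3, and data1 are placeholders:

# run from any meta node: register the three meta nodes
influxd-ctl add-meta meta1:8091
influxd-ctl add-meta meta2:8091
influxd-ctl add-meta meta3:8091
# register a data node with the cluster
influxd-ctl add-data data1:8088
# verify cluster membership
influxd-ctl show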
Among the many Time Series databases (TSDBs), InfluxDB has been able to secure its position in the market. This is because InfluxDB provides many features that give it an added advantage and make it sustainable in various scenarios. Besides these, it also provides easy installation and configuration. InfluxDB gives the user the flexibility to install a single instance, QuickStart Read more about InfluxDB: How to perform a single node installation on CentOS[…]
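A minimal sketch of a single-node install on CentOS 7 using the official InfluxData yum repository (repository URL as per docs.influxdata.com; adjust the release number to your system):

# register the InfluxData repository
cat <<'EOF' | sudo tee /etc/yum.repos.d/influxdb.repo
[influxdb]
name = InfluxDB Repository - RHEL 7
baseurl = https://repos.influxdata.com/rhel/7/x86_64/stable/
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key
EOF
# install and start the service
sudo yum install -y influxdb
sudo systemctl start influxdb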
The rapid growth of sensor-based IoT devices, social media, financial data like stock market activity, and many other information-streaming platforms created the opportunity to design a whole new database that can capture streaming information while highlighting the importance of time within it. This is so because even traditional RDBMSs were not able to efficiently handle complex business Read more about Time Series Database : Evolvement[…]
Since its inception in 2006, Amazon AWS has definitely come a long way. Engineers from Amazon have worked remarkably well, which has not only completely changed the horizon of the cloud but has also made it emerge as a boon for any business to adopt. Although there are many other vendors available in the cloud market, and they Read more about Cloud database war: Advantage shifting to Red?[…]
The storage engine is one of the key components of any database. It is, in fact, a software module used by the database management system to perform all storage-related operations, e.g. creating, reading, and updating information. The term storage covers both disk storage and memory storage. Choosing the right storage engine is Read more about WiredTiger: A game changer for MongoDB[…]
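As a quick aside, you can confirm which storage engine a running MongoDB instance is using straight from the shell; a one-liner sketch:

# prints the active storage engine block, e.g. { "name" : "wiredTiger", ... }
mongo --quiet --eval 'printjson(db.serverStatus().storageEngine)'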
There are times when you need to drop an existing database, for more than one reason. Dropping a database is not a tough job at all, provided you are very sure which database you should drop. Problem Statement: How to drop a database; the status or mode the database must be in for it to be dropped. Step Read more about Drop a database[…]
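For context, a minimal sketch of the usual Oracle sequence, run as SYSDBA; DROP DATABASE requires the database to be mounted in exclusive, restricted mode and closed:

sqlplus / as sysdba <<'SQL'
-- close the database cleanly
shutdown immediate;
-- mount exclusively with restricted sessions, the state DROP DATABASE requires
startup mount exclusive restrict;
-- removes the datafiles, online redo logs and control files known to the instance
drop database;
SQL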
Problem Statement: Restore an entire database using Data Pump; restore table(s), tablespace(s), or schema(s); restore using Transportable Tablespaces (TTS); restore from multiple small dump files; restore in parallel mode. Approach: There is a single-shot solution to all the above problem statements, and it is IMPDP in Data Pump. It is one of various Read more about Data Pump: impdp[…]
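A few illustrative impdp invocations, assuming a directory object DP_DIR has already been created (e.g. create directory DP_DIR as '/u01/dpdump'); all object names here are placeholders:

# full-database restore from a multi-file dump set, four parallel workers
impdp system directory=DP_DIR dumpfile=full_%U.dmp full=y parallel=4
# restore a single table
impdp system directory=DP_DIR dumpfile=hr.dmp tables=HR.EMPLOYEES
# restore one schema into a differently named schema
impdp system directory=DP_DIR dumpfile=hr.dmp schemas=HR remap_schema=HR:HR_DEV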
In the real production world, “prediction” of data growth is an important aspect of a DBA’s life, because it allows the business not only to foresee its real position in terms of existing hardware but also to plan the future expenses to be spent on hardware & storage. To generate a data growth report, Oracle provides Read more about Trend of data growth in Oracle Database[…]
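One simple, hedged way to feed such a report is to snapshot segment sizes on a schedule and compare the snapshots over time; a sketch of the collection query, where growth_history is a hypothetical table you would create yourself:

sqlplus / as sysdba <<'SQL'
-- record today's size per schema; run daily via cron or DBMS_SCHEDULER
insert into growth_history (sample_date, owner, size_gb)
select sysdate, owner, round(sum(bytes)/1024/1024/1024, 2)
from   dba_segments
group  by owner;
commit;
SQL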
Problem Statement: Back up an entire database using Data Pump; back up table(s), tablespace(s), or schema(s); back up using Transportable Tablespaces (TTS); generate multiple small dump files; back up in parallel mode. Approach: There is a single-shot solution to all the above problem statements, and it is Data Pump. It is one of various backup tools provided Read more about Oracle Data Pump: expdp & impdp[…]
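A few illustrative expdp invocations mirroring those problem statements; DP_DIR is a placeholder directory object, and FILESIZE is what splits the output into smaller dump files:

# full backup split into 5 GB pieces, written by four parallel workers
expdp system directory=DP_DIR dumpfile=full_%U.dmp filesize=5G full=y parallel=4
# back up a single schema
expdp system directory=DP_DIR dumpfile=hr.dmp logfile=hr.log schemas=HR
# back up individual tables
expdp system directory=DP_DIR dumpfile=emp.dmp tables=HR.EMPLOYEES,HR.DEPARTMENTS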
Among the various techniques for backing up your database, Oracle provides Data Pump as one tool which they are constantly improving, making it sharper release by release. Since its first launch with the 10g version, it has improved a lot, not only in terms of new features but also in terms of performance, as Read more about Data Pump: a tool to backup and restore database[…]
MongoDB is a document-oriented, open-source database developed in C++. It first took shape in 2007 when, in order to overcome the shortfalls of the existing database they were working with at the advertising company “DoubleClick”, the development team decided to go further rather than keep struggling with the database. The team of this advertising company was Read more about MongoDB Enterprise Edition Installation – Ubuntu[…]
Sharing the same MongoDB background as above, this post covers cleanly removing MongoDB from an Ubuntu system. Read more about How to: Uninstallation of MongoDB on Ubuntu[…]
Again with the same MongoDB background, this post walks through installing MongoDB on Ubuntu step by step. Read more about MongoDB Installation – Ubuntu[…]
About NoSQL: Let’s first understand NoSQL; we will explore MongoDB and its inception later. Like any other type of database, a NoSQL database provides a mechanism to store and retrieve data, but it is modeled in a way that is deliberately different from a relational database. In short, a NoSQL database does not store Read more about Mongo DB – An Introduction[…]
Problem statement: How to move data files from one location to another on the same storage; how to move data files from one storage to another; how to rename data files to standardize their names. Environment / Scenario: You have a database where you have to move your data files from old, slower storage Read more about How to re-organize your data files of a tablespace[…]
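A hedged sketch of the two common techniques; the file paths are placeholders, the online syntax needs Oracle 12c or later, and the offline variant assumes archivelog mode:

sqlplus / as sysdba <<'SQL'
-- 12c+: move the file online, no outage
alter database move datafile '/old/users01.dbf' to '/new/users01.dbf';

-- pre-12c: offline, copy at OS level, rename, recover, online
alter database datafile '/old/users01.dbf' offline;
host cp /old/users01.dbf /new/users01.dbf
alter database rename file '/old/users01.dbf' to '/new/users01.dbf';
recover datafile '/new/users01.dbf';
alter database datafile '/new/users01.dbf' online;
SQL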
Problem statement: How to migrate huge data from one DB to another; multi-terabyte data loaded on one database should be copied to another database. Environment: You have a multi-terabyte database; your database is growing daily, based on data feeds; the number of indexes on these tables is very high, and thus the size of the indexes Read more about How to copy Multi terabyte data to another Database[…]
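One hedged approach that avoids staging multi-terabyte dump files altogether is Data Pump over a database link, excluding the oversized indexes and rebuilding them on the target afterwards; SRC_LINK, DP_DIR, and APPDATA are placeholders:

# pull the schema directly from the source database; no dump file is written
impdp system directory=DP_DIR logfile=big_copy.log \
      network_link=SRC_LINK schemas=APPDATA \
      exclude=index parallel=8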
Problem Statement: Move a DB (with Oracle binaries) onto new storage; create a new DEV/UAT from Production. Approach: When creating a new UAT or DEV from production and bringing that Oracle installation to the same patch-set level as production, there is more than one approach you can follow. For example, either you can install from Read more about Move your DB with Oracle Binaries[…]
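As one hedged illustration of the clone route (as used up through 11g/12c): archive the patched ORACLE_HOME, unpack it on the new host or storage, and let the installer register it; all paths and the home name are placeholders:

# on the source: archive the Oracle home with permissions preserved
tar -cpzf /stage/db_home.tgz -C /u01/app/oracle/product/12.1.0 dbhome_1
# on the target: unpack, then clone-register the home
tar -xpzf /stage/db_home.tgz -C /u01/app/oracle/product/12.1.0
cd /u01/app/oracle/product/12.1.0/dbhome_1/oui/bin
./runInstaller -clone -silent \
    ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1 \
    ORACLE_HOME_NAME=OraDB12Home1 \
    ORACLE_BASE=/u01/app/oracle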
Problem Statement: Load millions of rows from flat files (CSV) into the database; load one table from another huge table; speed up INSERT statements. Approach: For creating records in the database using INSERT there are two methods: conventional path and direct path. If we look at the performance aspects of both approaches, the latter Read more about How to: Speed-up your Inserts[…]
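A hedged sketch of the direct-path flavour of each load; the user, control file, and table names are placeholders:

# direct-path load of a flat file with SQL*Loader
sqlldr userid=scott control=emp.ctl direct=true

sqlplus scott <<'SQL'
-- direct-path insert: writes above the high-water mark, bypassing the buffer cache
alter session enable parallel dml;
insert /*+ APPEND PARALLEL(t, 4) */ into big_table_copy t
select * from big_table;
commit;  -- the loaded segment is unreadable by this session until commit
SQL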
Once this shuffling is completed, REDUCE comes into action. Its task is to process the input given by SHUFFLE into output, so that the user can understand the result of the file processed by Hadoop. After shuffling is complete, it is clear that one word will be processed by only one DN and not Read more about MapReduce Unwinding … Reduce[…]
This is in continuation of MapReduce Processing. This output will be the input for the next process, which is SORT. Sort takes this [L<K,V>] and sorts all the words in alphabetical order (a to z) on each DN. Sorted arrangement on the DNs: Read more about MapReduce Unwinding … Sort & Shuffle[…]
In the last discussion on MapReduce, we covered the algorithm Hadoop uses for data processing with MapReduce. Now it is time to understand this in detail with the help of an example. Let’s consider our scenario: we have a 7-node cluster where 1 node is the Name Node (NN) and the rest of the 6 nodes are Read more about MapReduce Unwinding. . . . . Map[…]
Having discussed, in my last blog, “How Hadoop manages Fault Tolerance” within its cluster while processing data, it is now time to discuss the algorithm MapReduce uses to process that data. It is the Name Node (NN) where a user submits his request to process data and submits his data files. As soon as the NN receives data Read more about MapReduce Unwinding. . . . . .Algorithm[…]
Before we look at the intermediate data produced by the mapper, it is quite interesting to examine the fault-tolerance aspects of Hadoop with respect to MapReduce processing. Once the Name Node (NN) receives the data files that have to be processed, it splits the data files to assign them to Data Nodes (DN). This assignment would be Read more about MapReduce Unwinding. . . . . . Fault Tolerance[…]
The philosophy behind how MapReduce works is straightforward and can be summarized in 6 steps. Whatever data we provide as input to Hadoop, it first splits that data into a number of smaller pieces. Typically, the size of each split is limited to 64 MB. If a file of 1 TB arrives for processing on a data node, Read more about MapReduce Unwinding. . . . . Philosophy[…]
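For intuition, the whole map, sort/shuffle, reduce pipeline can be mimicked on a single machine with standard shell tools; a toy word-count sketch, where input.txt is a placeholder file:

# map: emit one word per line | sort/shuffle: group identical keys | reduce: count each group
tr -s ' ' '\n' < input.txt | sort | uniq -c | sort -rn | head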
MapReduce is a programming paradigm which provides an interface for developers to map end-user requirements (any type of analysis on data) to code. This framework is one of the core components of Hadoop. The way it provides fault tolerance and massive scalability across hundreds or thousands of servers in a cluster for the processing of Read more about MapReduce : Internals[…]
Inspired by the Google File System, which Google developed in C++ during 2003 to enhance its search engine, the Hadoop Distributed File System (HDFS), a Java-based file system, became the core component of Hadoop. With its fault-tolerance and self-healing features, HDFS enables Hadoop to harness the true capability of distributed processing techniques by turning Read more about HDFS Architecture Explained[…]
Because of the limitations of the enterprise data warehousing tools available at the time, organizations were not able to consolidate their data in one place while maintaining fast data processing. Traditional ETL tools may take hours, days, and sometimes even weeks. The performance of these tools is constrained by two hardware limitations. Vertical hardware scalability: hardware can be scaled Read more about MAGIC OF HADOOP[…]
At the outset of the twenty-first century, somewhere around 1999-2000, with the increasing popularity of XML and Java, the internet was evolving faster than ever. As the World Wide Web grew at a dizzying pace, even though the search engine technologies of the day were working fine, a better open-source search engine was the need of the hour to cater to the future Read more about Journey of Hadoop[…]
Innovations in technology have made resources cheaper than before. This enables organizations to store more data at a lower cost, thus increasing the size of their data. Gradually the data has grown bigger, moving from Megabytes (MB) to Petabytes (1e+9 MB). This huge increase in data requires a different kind of processing and ways of Read more about Big Data: An Introduction[…]