FAQ - Frequently Asked Questions.

Cloud Computing
AWS

AWS stands for Amazon Web Services; it is a collection of remote computing services, collectively known as a cloud computing platform. This realm of cloud computing is commonly referred to as IaaS, or Infrastructure as a Service.

AMI stands for Amazon Machine Image. 

It’s a template that provides the information (an operating system, an application server, and applications) required to launch an instance, which is a copy of the AMI running as a virtual server in the cloud. You can launch instances from as many different AMIs as you need.

A buffer makes the system more robust for managing traffic or load by synchronizing different components. Components usually receive and process requests at uneven rates; a buffer balances them so they can work at a steady speed and provide faster service.
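
The idea is easy to see in a short, illustrative Python sketch (not any AWS API; all names here are made up): a queue acts as the buffer between a bursty producer and a slower consumer, so each component works at its own pace.

    import queue
    import threading
    import time

    buffer = queue.Queue(maxsize=100)  # the buffer between the two components

    def producer():
        for i in range(10):
            buffer.put(f"request-{i}")  # requests arrive in a burst

    def consumer():
        for _ in range(10):
            buffer.get()                # processed at the consumer's own pace
            time.sleep(0.01)            # simulate slower processing
            buffer.task_done()

    threading.Thread(target=producer).start()
    threading.Thread(target=consumer).start()
    buffer.join()                       # wait until every request is handled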

Amazon Web Services provides a wide range of services. EC2 is one of AWS's most popular services.

EC2, or Elastic Compute Cloud, allows users to rent virtual computers on which they run their own applications. It provides scalable compute capacity: cloud-based, on-demand, ready-to-use virtual hardware with a wide variety of configuration options.

Renting virtual machines, storing data on virtual devices using EBS, distributing the load across machines using ELB, and scaling services using auto-scaling groups are the main features of EC2.
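
As an illustration, the sketch below uses the boto3 SDK to launch a single EC2 instance. The region, AMI ID, and key-pair name are placeholders to replace with your own values.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Launch one t2.micro instance from a placeholder AMI.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
        InstanceType="t2.micro",
        KeyName="my-key-pair",            # placeholder key pair
        MinCount=1,
        MaxCount=1,
    )
    print("Launched", response["Instances"][0]["InstanceId"])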

DynamoDB is a fully managed, serverless, key-value NoSQL database. It is designed to run high-performance applications at any scale.

It offers built-in security and supports continuous backups, automated multi-region replication, in-memory caching, and data export tools.

DynamoDB uses hashing and B-trees to manage data.

Incoming data is distributed into different partitions by hashing on the partition key. 

Each partition can store up to 10GB of data and handle, by default, 1,000 write capacity units (WCU) and 3,000 read capacity units (RCU).
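
A minimal boto3 sketch (the table name and attributes are hypothetical) shows the partition key in action; DynamoDB hashes its value, here user_id, to pick the partition that stores the item:

    import boto3

    dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
    table = dynamodb.Table("Users")  # hypothetical table keyed on "user_id"

    # The partition key value is hashed to choose the storage partition.
    table.put_item(Item={"user_id": "u123", "name": "Alice", "plan": "free"})

    item = table.get_item(Key={"user_id": "u123"})["Item"]
    print(item["name"])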

AWS Lambda is a compute service that runs code without provisioning or managing servers.

It was launched in 2014, and since then, it has seen tremendous growth in popularity.

AWS Lambda makes it easy to build, deploy, and manage applications without worrying about the underlying infrastructure required to support them.

AWS Lambda supports programming languages such as Python, Java, C#, and Go.

AWS Lambda runs code in response to events and automatically manages the compute resources. 

It can be used in various ways, including as a trigger for other AWS services.

Lambda is a pay-as-you-go service, so you only pay for what you use: you are charged based on the number of requests for your functions and the time it takes your code to execute.
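
In Python, a Lambda function is just a handler that receives the triggering event. The sketch below assumes an HTTP-style event (for example, from API Gateway) with a JSON body:

    import json

    def lambda_handler(event, context):
        # "event" carries the trigger's payload; its shape depends on the
        # event source (API Gateway, S3, SQS, ...).
        name = json.loads(event.get("body") or "{}").get("name", "world")
        return {
            "statusCode": 200,
            "body": json.dumps({"message": f"Hello, {name}!"}),
        }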

GCP
Software Testing
Automation Testing

Automation testing, or test automation, is the process of automating the manual testing of the application or system under test. It involves using a separate testing tool to create test scripts that can be executed repeatedly and don't require any manual intervention.

Selenium

Selenium is one of the most popular automated testing suites. It is designed to support and encourage the automated testing of the functional aspects of web-based applications across a wide range of browsers and platforms. Because of its roots in the open-source community, it has become one of the most widely accepted tools among testing professionals. Selenium is not a single tool or utility but a package of several testing tools, which is why it is referred to as a suite; each tool caters to different testing and test environment requirements. Its main advantages are listed below, followed by a minimal Python example.

  1. It's free and open-source.
  2. It has a large user base and a helpful community.
  3. It has cross-browser compatibility (Firefox, Chrome, Internet Explorer, Safari, etc.).
  4. It has great platform compatibility (Windows, Mac OS, Linux, etc.).
  5. It supports multiple programming languages (Java, C#, Ruby, Python, Perl, etc.).
  6. It has fresh and regular repository developments.
  7. It supports distributed testing.
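
A minimal Selenium script in Python might look like this (it assumes a locally installed Chrome browser, and the target URL and assertion are placeholders):

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes Chrome is installed locally
    try:
        driver.get("https://example.com")      # placeholder URL
        heading = driver.find_element(By.TAG_NAME, "h1")
        assert "Example" in heading.text       # a simple functional check
    finally:
        driver.quit()
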
Big Data

Big data is defined as the voluminous amount of structured, unstructured, or semi-structured data that has huge potential for mining but is so large that it cannot be processed using traditional database systems. Big data is characterized by its high velocity, volume, and variety and requires cost-effective and innovative methods for information processing to draw meaningful business insights. More than the volume of the data, the nature of the data defines whether it is considered Big Data or not.

IBM has a nice, simple explanation of the four critical features of big data:

  1. Volume – Scale of data
  2. Velocity – Analysis of streaming data
  3. Variety – Different forms of data
  4. Veracity – Uncertainty of data

Big Data analytics is a complex process of examining big data to uncover information such as hidden patterns, correlations, market trends, etc.

Big data refers to any large and complex collection of data, a combination of structured, semi-structured, and unstructured data.

In Big Data analytics, one will need to use sophisticated technological tools such as automation tools or parallel computing tools.

Hadoop, Spark, Hive, and Pig are the most commonly used tools in the Hadoop Ecosystem.

Libraries like PySpark, MLlib, Pig Latin, etc. are used in Big Data analysis.
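
For example, a minimal PySpark job (the input path and column name are placeholders) distributes a large file across a cluster and aggregates it:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("BigDataExample").getOrCreate()

    # Read a (placeholder) CSV file too large for one machine; Spark
    # partitions it across the cluster automatically.
    df = spark.read.csv("hdfs:///data/events.csv", header=True, inferSchema=True)
    df.groupBy("event_type").count().show()

    spark.stop()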

Big Data is used by industries such as banking, retail, finance, etc.

Big Data Testing

Big Data Testing involves testing the various data storage tools, frameworks, and processes for big data apps. 

Big data testing focuses on the performance and functional testing of the application framework's data processing components.

It covers test plans, estimation, test cases, test scripts, error tracking, reporting, and other aspects of Hadoop testing.

Big Data Testing underlines the need for understanding the processes involved in managing large volumes of data, processing applications that work with huge volumes of data, and more.

Spark
Hive
Hadoop
Data Science

Python is the most popular language in data science. 

Used by more than 80% of data scientists to extract insights from data, Python is a powerful yet easy-to-learn coding language that you can leverage to solve real-world problems.

Data Analytics

Data analytics is the process of extracting meaningful information from data.

In Data Analytics, structured data is used for analysis.

Statistical modeling and predictive modeling are typically done with relatively simple tools.

Microsoft Power BI, SAP Business Objects, Sisense, and TIBCO Spotfire are the most commonly used data analytics tools.

TensorFlow, NumPy, and SciPy libraries are used for data analysis.

Data analytics is used in nearly every industry (IT, Manufacturing, Marketing, Healthcare) and every aspect of our lives, like politics, welfare schemes, and more.

Data Analysis

Data Analysis includes defining a data set, investigating it, cleaning it, and transforming it into something useful. It is used in descriptive analysis, exploratory analysis, and obtaining useful insights from the data.
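
As a small illustration (the file and column names are hypothetical), the pandas library covers each of these steps:

    import pandas as pd

    # Define the data set: load a (hypothetical) CSV file.
    df = pd.read_csv("sales.csv")

    # Investigate it.
    print(df.describe())

    # Clean it: drop rows with missing values.
    df = df.dropna()

    # Transform it into something useful: total revenue per region.
    print(df.groupby("region")["revenue"].sum())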

Data Integration

Data integration is a process of combining multiple sets of data into one by matching, merging, and transforming the data. It can be done manually or automatically.

In the past, companies used data integration to combine their existing databases. Now, with advancements in big data technologies such as Hadoop and Spark, companies use it to integrate their business systems with external systems such as social media platforms (Twitter, Facebook, etc.).
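
The matching-and-merging step can be sketched with pandas (the data frames and join key are hypothetical):

    import pandas as pd

    customers = pd.DataFrame({"id": [1, 2], "name": ["Alice", "Bob"]})
    orders = pd.DataFrame({"id": [1, 2], "total": [250, 99]})

    # Match records from the two sources on a shared key and merge them.
    combined = customers.merge(orders, on="id", how="inner")
    print(combined)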

ETL Testing

ETL stands for Extract, Transform, and Load. It is an important concept in Data Warehousing systems. Extraction stands for extracting data from different data sources, such as transactional systems or applications. Transformation stands for applying the conversion rules to data so that it becomes suitable for analytical reporting. The loading process involves moving the data into the target system, normally a data warehouse.
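
A toy end-to-end ETL sketch in Python (the source file, conversion rule, and target table are all hypothetical, with SQLite standing in for the data warehouse):

    import sqlite3
    import pandas as pd

    # Extract: pull raw records from a (hypothetical) source system export.
    df = pd.read_csv("transactions.csv")

    # Transform: apply a conversion rule so the data suits reporting,
    # e.g. convert amounts from cents to dollars.
    df["amount"] = df["amount_cents"] / 100.0

    # Load: move the result into the target system.
    conn = sqlite3.connect("warehouse.db")
    df[["txn_id", "amount"]].to_sql("fact_transactions", conn,
                                    if_exists="append", index=False)
    conn.close()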

Data warehousing is the process of storing, organizing, and retrieving data in a structured way to make it available for the decision-making processes. The main purpose of Data Warehousing is to provide information about the company’s performance at any given point in time for its management to make decisions about its future strategies.

Database Testing

Database testing is typically performed on transactional systems. The transactional database receives data from various applications.

It is tested with the goal of having proper data in the columns.

It is used to integrate data from various applications and to assess server impact.

An Entity-Relationship Model is used.

Database testing is used in OLTP databases.

It uses normalized data with joins.

QTP and Selenium are commonly used tools in database testing.
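
A simple column-level check of the kind used in database testing might look like this (the database, table, and rules are hypothetical):

    import sqlite3

    conn = sqlite3.connect("orders.db")  # hypothetical transactional database
    cur = conn.cursor()

    # Verify the columns hold proper data: no order may have a NULL
    # customer or a non-positive total.
    cur.execute("SELECT COUNT(*) FROM orders "
                "WHERE customer_id IS NULL OR total <= 0")
    bad_rows = cur.fetchone()[0]
    assert bad_rows == 0, f"{bad_rows} rows violate the data rules"

    conn.close()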

Programming
Python

Python has a wide range of libraries that are available for different purposes.

Python libraries are packages that provide additional functionality to Python programs.

Python libraries are often used for data science, machine learning, web scraping, and other automation tasks.

The most popular libraries are NumPy, scikit-learn, Matplotlib, Pandas, and TensorFlow.
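
For instance, two of these libraries in a few lines (the numbers are arbitrary sample data):

    import numpy as np
    import pandas as pd

    # NumPy: fast numeric arrays.
    values = np.array([1.0, 2.0, 3.0])
    print(values.mean())

    # pandas: labeled, tabular data built on top of NumPy.
    df = pd.DataFrame({"score": values})
    print(df.head())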

Java

Java uses a Just-In-Time compiler to enable high performance. 

A Just-In-Time compiler is a program that translates Java bytecode, which would otherwise have to be interpreted instruction by instruction, into native machine instructions that can be sent directly to the processor.

Java runs on a variety of platforms, such as Windows, Mac OS, and the various versions of UNIX/Linux like HP-UX, Sun Solaris, Red Hat Linux, Ubuntu, CentOS, etc.

Core Java
Advanced Java
SQL
C
C++
C#
JavaScript
Business Analytics

Business analytics is a subset of business intelligence in which a professional, the business analyst, uses automated data analysis practices, tools such as SAP BusinessObjects or Tableau, statistical models, and other quantitative methods to process historical business data.

This iterative, methodical exploration of an organization's data, with a focus on statistical analysis, drives decision-making, informs business strategy, and helps optimize performance.

Business analytics is undergoing significant, disruptive developments that will fundamentally alter how the industry and customers view analytics. One of the main drivers of this transition is the exponential growth of big data.

There are three types of business analytics: descriptive, predictive, and prescriptive.

A business analyst analyzes historical business data to identify trends, patterns, and root causes in specific areas such as marketing, human resources, finance, and operations, and makes data-driven business decisions based on those insights.

Business analysts use a set of automated data analysis practices, tools, and services that help explain both what is happening in the business and why, in order to improve decision-making and help plan for the future.

Artificial Intelligence
Machine Learning
Deep Learning
DevOps