Once the data is in a DataFrame, regardless of its source (CSV, XLS, database, etc.), we can work with it like a table in a database and select the elements we are interested in.

```
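import pandas as pd

df1 = pd.read_csv('example.csv')  # load the CSV file into a DataFrame
print(df1.head())                 # first five rows
print(df1.shape)                  # (number of rows, number of columns)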
```

We have loaded data from a CSV file and created a DataFrame. Now we will look at different select operations on this DataFrame.

### Selecting a single column in DataFrame

```
column1 = df1['column_name']
column1.head()
```

### Selecting multiple columns in DataFrame

```
cols = df1[['column1','column2']]  # pass a list of column names (note the double brackets)
cols.head()
```
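Since `example.csv` is not shown here, the following is a minimal sketch using a small hand-built DataFrame (the column names and values are made up for illustration). It shows that a single label returns a Series, while a list of labels returns a DataFrame.

```python
import pandas as pd

# Hypothetical data standing in for example.csv
df = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cid'],
    'age': [28, 35, 41],
    'city': ['Pune', 'Delhi', 'Goa'],
})

single = df['age']           # one label -> pandas Series
multi = df[['name', 'age']]  # list of labels -> DataFrame

print(type(single).__name__)
print(type(multi).__name__)
print(multi.shape)
```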

### Selecting rows using indexing [] in DataFrame

```
rows = df1[10:20]  # rows at positions 10 to 19
print(rows)
rows = df1[10:20][['column1','column2']]  # same rows, two columns only
print(rows)
```

### Selecting rows by label (.loc[]) in DataFrame

```
df2 = df1.loc[10:20]
df3 = df1.loc[10:20,['column1','column2']]
print(df2)
print(df3)
```

### Selecting rows by position (.iloc[]) in DataFrame

`dfpos = df1.iloc[10:20,[3,4]]`
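One detail that is easy to miss is that `.loc[]` slices include the end label, while `.iloc[]` and plain `[]` slices stop before the end position. A minimal sketch, assuming a made-up DataFrame with the default integer index:

```python
import pandas as pd

# Hypothetical frame with a default integer index 0..29
df = pd.DataFrame({'a': range(30), 'b': range(100, 130)})

by_label = df.loc[10:20]      # label slice: both 10 and 20 are included
by_position = df.iloc[10:20]  # positional slice: 20 is excluded
by_brackets = df[10:20]       # plain [] slicing is positional too

print(len(by_label))     # 11 rows
print(len(by_position))  # 10 rows
print(len(by_brackets))  # 10 rows
```

With a default integer index the labels and positions happen to coincide, so the only visible difference is the extra last row returned by `.loc[]`.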

Now that we have seen how to fetch the required rows and columns, we move on to manipulating a DataFrame.

## How to manipulate a DataFrame

### 1. Transpose

```
import pandas as pd
df1 = pd.read_csv('example.csv')
df2 = df1[10:20][['cols1','cols2']]  # a list of column names is needed here
print("transpose : {}".format(df2.T))
```
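To make the effect of `.T` visible without the CSV file, here is a small sketch on a made-up DataFrame. Rows become columns and column names become the new index.

```python
import pandas as pd

# Hypothetical 3x2 frame
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

t = df.T  # transpose: shape (3, 2) becomes (2, 3)

print(df.shape)
print(t.shape)
print(t)
```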

### 2. sort_values

```
df2 = df1.sort_values(by='column_name') # sorting by a single column
df3 = df1.sort_values(by=['col1','col2']) # sorting by multiple columns
print(df2)
print(df3)
```
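`sort_values` sorts in ascending order by default and returns a new DataFrame. The sketch below, on made-up data, uses the `ascending` parameter to sort in descending order and to mix directions across columns.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({
    'team': ['B', 'A', 'B', 'A'],
    'score': [10, 30, 20, 40],
})

# Descending order by a single column
by_score = df.sort_values(by='score', ascending=False)

# 'team' ascending, then 'score' descending within each team
mixed = df.sort_values(by=['team', 'score'], ascending=[True, False])

print(by_score)
print(mixed)
```

The original `df` keeps its row order, because the sorted result is a separate object.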

### 3. sort_index

```
df2 = df1.sort_index()
print(df2)
```

### 4. Re-indexing

```
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(3),index=['a','b','c'])
print(df1)
df2 = df1.reindex([1,2,3])  # none of these labels exist in df1, so every row is NaN
print(df2)
```
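Re-indexing is easier to follow when some of the new labels already exist. In this sketch (values made up), existing labels keep their data, a new label gets NaN, and the `fill_value` parameter supplies a default instead.

```python
import pandas as pd

# Hypothetical frame with string labels
df = pd.DataFrame({'value': [1.0, 2.0, 3.0]}, index=['a', 'b', 'c'])

reordered = df.reindex(['c', 'a', 'b'])                     # same labels, new order
extended = df.reindex(['a', 'b', 'c', 'd'])                 # 'd' is new, so it gets NaN
filled = df.reindex(['a', 'b', 'c', 'd'], fill_value=0.0)   # 'd' gets 0.0 instead

print(reordered)
print(extended)
print(filled)
```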

### 5. Adding a new column

```
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(10))
print(df1)
df1['col_new'] = 'a'
print(df1)
```
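A new column does not have to be a constant. As a sketch with made-up column names, it can also be computed from existing columns, either in place or with `assign`, which returns a new DataFrame.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({'price': [10.0, 20.0], 'qty': [3, 4]})

# Computed column added in place
df['total'] = df['price'] * df['qty']

# assign() leaves df unchanged and returns a copy with the extra column
with_flag = df.assign(large=df['total'] > 50)

print(df)
print(with_flag)
```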

### 6. Remove existing column

```
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(10))
print(df1)
df1['col_new'] = 'a'
print(df1)
del df1['col_new'] # removes the column in place; df1.drop(columns='col_new') returns a new DataFrame without it
```
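Besides `del`, pandas offers `drop` and `pop` for removing columns. The following sketch on a made-up DataFrame shows that `drop` returns a new object, while `pop` modifies the DataFrame in place and hands back the removed column.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({'x': [1, 2], 'y': [3, 4], 'z': [5, 6]})

trimmed = df.drop(columns=['y', 'z'])  # new DataFrame; df is untouched
popped = df.pop('z')                   # removes 'z' from df and returns it as a Series

print(trimmed.columns.tolist())
print(df.columns.tolist())
print(popped.tolist())
```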

### 7. Data at particular location by label

```
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(3,5),index=[1,2,3],columns=['a','b','c','d','e'])
val1 = df1.at[1,'a']  # .at uses square brackets: row label, column label
print(val1)
df1.at[1,'a'] = 0 # assign value at particular location
```

### 8. Data at particular location by position

```
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(3,5),index=[1,2,3],columns=['a','b','c','d','e'])
val1 = df1.iat[1,1]  # second row, second column (positions start at 0)
print(val1)
df1.iat[1,1] = 0 # assign value at particular location
df1[df1>0] = 2 # assign data at all locations that satisfy a condition
```
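Because the index in these examples starts at 1, the label `1` and the position `1` point to different rows. A small sketch with fixed, made-up values makes the difference between `.at[]` and `.iat[]` visible, and also shows assignment through a boolean condition.

```python
import pandas as pd

# Hypothetical frame whose row labels start at 1
df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], index=[1, 2, 3], columns=['a', 'b'])

by_label = df.at[1, 'a']    # row labelled 1 is the first row
by_position = df.iat[1, 0]  # position 1 is the second row

print(by_label)
print(by_position)

df[df > 4] = 0  # every value greater than 4 is replaced by 0
print(df)
```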

### 9. Applying a method or function in DataFrame

```
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(3,5),index=[1,2,3],columns=['a','b','c','d','e'])
def add_five(number):
    return number + 5

result = df1.apply(add_five, axis=1)  # axis=1 passes each row to the function, axis=0 each column
print(result)
```
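The `axis` argument matters most when the function reduces its input to a single value. In this sketch, `value_range` is a made-up helper: with `axis=0` it receives each column, with `axis=1` each row.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]}, index=[1, 2, 3])

def value_range(series):
    """Difference between the largest and smallest value (illustrative helper)."""
    return series.max() - series.min()

col_range = df.apply(value_range, axis=0)  # one result per column
row_range = df.apply(value_range, axis=1)  # one result per row

print(col_range)
print(row_range)
```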

There are a few more functions, such as **dropna(), fillna()** etc., which are used in the manipulation of data. DataFrame also has many statistical methods like *info(), describe(), value_counts(), mean(), std()* **etc.**

We will learn filtering and iterating in DataFrame in an upcoming post.
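As a small illustration of `dropna()`, `fillna()` and `mean()`, here is a sketch on a made-up DataFrame containing missing values. By default, `mean()` skips NaN entries.

```python
import pandas as pd
import numpy as np

# Hypothetical data with missing values
df = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': [4.0, 5.0, np.nan]})

dropped = df.dropna()  # keeps only rows without any NaN
filled = df.fillna(0)  # replaces every NaN with 0
col_means = df.mean()  # per-column mean, NaN values are skipped

print(dropped)
print(filled)
print(col_means)
```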

Keep learning!

Hope it helps 🙂
