Python tips and tricks
This is a list of helpful Python tips and tricks for my workflow.
- [ ] zip
- [ ] List/Dict comprehensions
- [ ] Generators
- [ ] pandas
zip¶
`z1 = zip(l1, l2)` combines two lists (e.g., `l1` and `l2`) into an iterator of tuples, pairing items by position and stopping at the end of the shorter list. This is useful when you have several related lists (e.g., names, phone numbers, addresses) and need to group them together quickly.
`list(z1)` converts the `zip` object into a list of tuples.
`*z1` unpacks the `zip` object. A `zip` object is a one-shot iterator, so once it has been unpacked (or otherwise consumed) it is empty and must be recreated before it can be used again.
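l1 = range(0,5)
l2 = ['a','b','c','d','e','f']  # one item longer than l1; zip drops the extra 'f'
z1 = zip(l1,l2)
print(type(z1))
print(*z1)  # unpacking consumes the zip object
z1 = zip(l1,l2)  # recreate it before using it again
listZ = list(z1)
print(listZ)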
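A minimal sketch of two consequences of the behaviour described above, using made-up `names` and `phones` lists (illustration data only): a `zip` object is empty after one pass, and `zip(*pairs)` reverses the pairing.

```python
# Hypothetical example data, not taken from the notes above.
names = ['ann', 'bob', 'cy']
phones = ['555-0100', '555-0101', '555-0102']

pairs = zip(names, phones)
first_pass = list(pairs)    # consumes the iterator
second_pass = list(pairs)   # nothing is left the second time
print(first_pass)
print(second_pass)

# zip(*...) "unzips" a list of tuples back into separate tuples.
names_again, phones_again = zip(*first_pass)
print(names_again)
print(phones_again)
```

If the grouped data is needed more than once, storing `list(zip(...))` is usually simpler than recreating the `zip` object each time.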
List/Dict comprehensions¶
- [ ] List comprehension.
- [ ] List comprehension with if conditional.
- [ ] List comprehension with nested if conditional.
- [ ] List comprehension with if-else conditional.
- [ ] Nested List comprehensions.
- [ ] Dict comprehension.
List comprehension¶
x = ['alpha','beta','gamma','theta']
upper_x = [val.upper() for val in x]  # build a new list rather than calling print() inside the comprehension
print(upper_x)
List comprehension (if-conditional)¶
`if` conditionals come after the `for` loop.
x = list(range(0,100,5))
print('x:{0}'.format(x))
cond_x = [val for val in x if val > 50]
print('y:{0}'.format(cond_x))
List comprehension (nested if-conditional)¶
x = list(range(0,100,5))
print('x:{0}'.format(x))
cond_x = [val for val in x if val > 50 if val % 2]
print('y:{0}'.format(cond_x))
List comprehension (if-else conditional)¶
`if-else` conditionals come before the `for` loop.
x = list(range(0,100,5))
print('x:{0}'.format(x))
cond_x = [val if val > 50 else 0 for val in x]
print('y:{0}'.format(cond_x))
Nested List comprehensions¶
We show how to perform nested `for` loops using list comprehensions. The first snippet shows how we would write it using `for` loops.
my_list = []
for x in [10, 25, 50]:
    for y in [1, 3, 5]:
        my_list.append(x * y)
print(my_list)
Below is how to write the same nested `for` loop using a list comprehension. The `for` clauses appear in the same order as in the loop version.
nested_my_list = [ x*y for x in [10,25,50] for y in [1,3,5] ]
print(nested_my_list)
Dict comprehension¶
COMING SOON!
Generator expressions¶
Generator expressions have the same syntax as list comprehensions, except that they use `()` instead of `[]`. Items are produced lazily, one at a time: iterate over the generator with a `for` loop, or call `next(genObj)` to fetch the next item. A generator can only be consumed once.
names = ['tracy','clarissa','tom','hyacinth','sowhattoo']  # not named `list`, which would shadow the built-in
gen_expr = ( len(name) for name in names )
print('item 1:{0}'.format(next(gen_expr))) # prints first item
print('item 2:{0}'.format(next(gen_expr))) # prints second item
print(*gen_expr) # prints all remaining items
print(list(gen_expr)) # prints [] because *gen_expr already exhausted the generator
print(','.join(names))
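To illustrate the lazy, one-pass behaviour, here is a minimal sketch (the `numbers` and `squares` names are illustrative only). A generator expression can be passed straight to a function such as `sum` without building an intermediate list, and `next` accepts a default value to return once the generator is exhausted.

```python
numbers = range(1, 6)

# Passed directly to sum(); no intermediate list is built.
total = sum(n * n for n in numbers)
print(total)

squares = (n * n for n in numbers)
first = next(squares)         # the first item
rest = list(squares)          # everything that is left
after = next(squares, None)   # exhausted, so the default (None) is returned
print(first, rest, after)
```

Passing a default to `next` avoids the `StopIteration` exception that a bare `next(squares)` would raise at this point.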
Pandas¶
- Dataframe manipulation
- Filtering
import os
import seaborn as sns
import pandas as pd
import numpy as np
iris = sns.load_dataset('iris')
print(iris.head())
iris.info()
iris.describe()
DataFrame extraction: When you use single brackets (`[]`) to select a column of a `DataFrame`, it returns a `Series`. If you use double brackets (`[[]]`), it returns a `DataFrame`.
sepal_data = iris['sepal_length']
print(type(sepal_data))
sepal_data = iris[['sepal_length']]
print(type(sepal_data))
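Filtering data: When filtering on multiple criteria, combine the boolean masks element-wise with `np.logical_and` or `np.logical_or`. The operators `&` and `|` do the same, provided each condition is wrapped in parentheses; the plain Python keywords `and` and `or` raise an error on a `Series`.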
filter_data1 = iris[iris['sepal_length']>5]
filter_data2 = iris[np.logical_and(iris['sepal_length']>5,iris['petal_width']>0.3)]
filter_data1.head()
filter_data2.head()
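As a rough sketch of multi-criteria filtering on a small made-up frame (it reuses two iris column names, but the values are invented), the `np.logical_and` form and the `&` operator form produce the same mask:

```python
import numpy as np
import pandas as pd

# Made-up values; only the column names mirror the iris dataset.
df = pd.DataFrame({
    'sepal_length': [4.9, 5.1, 6.3, 5.8],
    'petal_width': [0.2, 0.2, 1.8, 0.4],
})

mask_np = np.logical_and(df['sepal_length'] > 5, df['petal_width'] > 0.3)
# Parentheses are required because & binds tighter than >.
mask_op = (df['sepal_length'] > 5) & (df['petal_width'] > 0.3)

print(mask_np.tolist() == mask_op.tolist())
print(df[mask_op])
```

Which form to use is a matter of taste; the operator form is shorter, while the `np.logical_*` form avoids the parentheses pitfall.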
Reading csv: When we read `csv` files, pandas is sometimes unable to recognize the date format. We have two options:
- Read in the csv file and perform a conversion later
- Write a date parser for the `pd.read_csv` command
currDir = os.getcwd()
fileName = os.path.join(currDir, 'inputs', 'FF_10_Industry_Portfolios.CSV')  # portable across operating systems
df_10indus_m = pd.read_csv(fileName,skiprows=11,nrows=1107,index_col=0,parse_dates=True)
df_10indus_m.head()
Perform a conversion on the datetime index
timeformat = '%Y%m' # Can be as complex as '%Y-%m-%d %H:%M'
df_10indus_m.index = pd.to_datetime(df_10indus_m.index,format=timeformat)
df_10indus_m.head()
Write a date parser as shown below
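from datetime import datetime
dateparser = lambda x: datetime.strptime(x,'%Y%m')  # pd.datetime was removed in pandas 2.0
dateparser('192004')
# date_parser is deprecated since pandas 2.0; newer versions accept date_format='%Y%m' instead
df_10indus_m = pd.read_csv(fileName,skiprows=11,nrows=1107,index_col=0,parse_dates=True,date_parser=dateparser)
df_10indus_m.head()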
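The "convert later" option can be tried without the data file. The sketch below assumes a tiny made-up CSV held in a string (the column names `a` and `b` and all values are hypothetical) whose first column is a `YYYYMM` key:

```python
import io
import pandas as pd

# Made-up two-row CSV; the first column holds a YYYYMM key.
csv_text = "date,a,b\n192607,1.0,2.0\n192608,3.0,4.0\n"

df = pd.read_csv(io.StringIO(csv_text), index_col=0)
# Cast to str first so the format is matched against '192607' rather than the integer.
df.index = pd.to_datetime(df.index.astype(str), format='%Y%m')
print(df.index)
```

Converting after the read keeps the `read_csv` call simple and works the same way regardless of which parsing options the installed pandas version supports.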
Looping¶
- List: `for idx, val in enumerate(my_list):` returns the index and value of each item in the list
- Dictionary: `for key, val in my_dict.items():` returns the key and value of each entry in the dict
- 2D array: `for item in np.nditer(arr_2d):` returns every item in the 2D numpy array
- DataFrame: `for idx, info in df.iterrows():` returns the index of each row and the information in that row as a `Series`
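The four looping patterns above can be sketched on small made-up containers (all names and values here are hypothetical):

```python
import numpy as np
import pandas as pd

# Made-up containers, named to avoid shadowing the built-ins list and dict.
my_list = ['a', 'b']
my_dict = {'x': 1, 'y': 2}
arr_2d = np.array([[1, 2], [3, 4]])
df = pd.DataFrame({'col1': [10, 20]}, index=['r0', 'r1'])

list_items = [(idx, val) for idx, val in enumerate(my_list)]
dict_items = [(key, val) for key, val in my_dict.items()]
# nditer visits every element; for this C-ordered array that is row by row.
array_items = [int(item) for item in np.nditer(arr_2d)]
# iterrows yields (index label, row as a Series).
row_items = [(idx, int(info['col1'])) for idx, info in df.iterrows()]
row_types = {type(info).__name__ for _, info in df.iterrows()}

print(list_items, dict_items, array_items, row_items, row_types)
```

`iterrows` is convenient for inspection but slow on large frames, so vectorised operations are usually preferable when they are available.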
Map vs. apply vs. applymap¶
Command | Description | Example |
---|---|---|
Map | Iterates over each element of a `Series`. | `df["col1"].map(lambda x: 5 + x)` adds 5 to each element of col1. `df["col1"].map(lambda x: "BNE" + x)` prepends "BNE" to each element of col1 (the column must hold strings). |
Apply | Applies a function along an axis of the `DataFrame`. | `df[["col1", "col2"]].apply(sum)` returns the sum of the values in col1 and the sum of the values in col2 (one result per column). |
ApplyMap | Applies a function to each element of the `DataFrame`. | `func = lambda x: x + 2` followed by `df.applymap(func)` adds 2 to each element of the DataFrame (all columns must be numeric). |
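A minimal sketch of the three calls on a made-up numeric frame. One assumption to flag: `DataFrame.applymap` was renamed to `DataFrame.map` in pandas 2.1, so the sketch picks whichever method the installed version provides.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [10, 20, 30]})  # made-up data

mapped = df['col1'].map(lambda x: 5 + x)             # element-wise on one Series
col_sums = df[['col1', 'col2']].apply(sum)           # one result per column (axis=0)
row_sums = df[['col1', 'col2']].apply(sum, axis=1)   # one result per row

# Use DataFrame.map where it exists (pandas >= 2.1), otherwise applymap.
elementwise = df.map if hasattr(df, 'map') else df.applymap
plus_two = elementwise(lambda x: x + 2)

print(mapped.tolist(), col_sums.tolist(), row_sums.tolist())
print(plus_two)
```

The `axis` argument is what distinguishes `apply` from the other two: it hands the function a whole column or row at a time rather than a single element.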
Counting items¶
- Use `collections.defaultdict` instead of a normal `dict` when a missing key should start from a default value, since it removes the need to check whether the key already exists. Use `collections.defaultdict(int)` when setting up a dictionary to count items.
- Use `collections.Counter` on any `Series` or other iterable to get a dict-like count of each value; its `most_common()` method returns a list of `(value, count)` tuples.
- `df["col1"].value_counts()` is another way to get a count of all items in that column.
iris.head()
Using value_counts¶
iris['species'].value_counts()
iris['sepal_length'].value_counts()
Using defaultdict¶
import collections
spec_cnt = collections.defaultdict(int)
spec = iris['species']
for s in spec:
    spec_cnt[s] += 1  # missing keys start at 0, so no membership check is needed
print(spec_cnt.keys())
print(spec_cnt.values())
Using collections.Counter¶
collections.Counter(spec)
cnt_sl = collections.Counter(iris['sepal_length'])
cnt_sl.most_common(10)
Writing sophisticated functions¶
Command | Access |
---|---|
def func(*args) | for v in args |
def func(**kwargs) | for k, v in kwargs.items() |
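A short sketch of both access patterns from the table in one function (the `describe` function and its arguments are hypothetical, for illustration only). Inside the function, `args` is a tuple and `kwargs` is a dict.

```python
def describe(*args, **kwargs):
    """Collect positional and keyword arguments into readable strings."""
    parts = [str(v) for v in args]                                  # args is a tuple
    parts += ['{0}={1}'.format(k, v) for k, v in kwargs.items()]    # kwargs is a dict
    return parts

direct = describe(1, 'two', unit='cm', scale=3)
# The same operators unpack a sequence or dict at the call site.
unpacked = describe(*[1, 2], **{'a': 3})
print(direct)
print(unpacked)
```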
Using reduce() and filter()¶
Coming soon!