by shigemk2

当面は技術的なことしか書かない

Pandasで実現したいサンプル

SELECT name,
       COALESCE(ROUND((test_x / test_y) * 100, 2), 0) as percentage
FROM
  (SELECT name,
          TYPE,
          text_x
   FROM INPUT
   WHERE TYPE = 'alpha') d1
LEFT OUTER JOIN
  (SELECT name,
          TYPE,
          text_y
   FROM INPUT
   WHERE TYPE = 'beta') d2 ON d1.name = d2.name

こういうクエリをPandasで実現したい

#coding: utf-8
import pandas as pd
import numpy as np
input_df = pd.read_csv('input.csv')
d1 = input_df[input_df['type'] == 'alpha']
d2 = input_df[input_df['type'] == 'beta']
merged_df = pd.merge(d1, d2, on='name')
dfp = pd.DataFrame(merged_df['name'], columns=['name'])
dfp['percentage'] = merged_df['test_x'].div(merged_df['test_y']).mul(100).round(2).replace([np.inf, -np.inf], 0)
# dfp.to_csv('./output.csv', mode='w', index=False, header=True, encoding="utf-8")