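How do you connect to an SSL-enabled Presto from Python?
And how do you convert the query result into a pandas DataFrame?
Examples without SSL are easy to find online, but I searched for a long time without finding one that shows the SSL parameters. I got it working, so here is the sample code.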

    import pandas as pd
    from requests.auth import HTTPBasicAuth  # needed by the SQLAlchemy example
    from sqlalchemy.engine import create_engine
    from pyhive import presto

The SQLAlchemy approach

    database_username = "bd-user"
    database_password = "xxxxxx"

    engine = create_engine(
        'presto://bd-user@presto.bi.cn/hive/dwd',
        connect_args={
            'protocol': 'https',
            'requests_kwargs': {
                'auth': HTTPBasicAuth(database_username, database_password),
                # verify=False skips TLS certificate verification
                'verify': False,
            },
        },
    )
    df_train = pd.read_sql(
        "select syscode,diffday from dwd.demandsdeliverycycle_result "
        "WHERE diffday >= -30 and created_time >= '2020-01-01'",
        engine,
    )
    print(df_train.head(10))

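The example above passes verify=False, which turns off certificate checking. If you have the CA certificate for your Presto server, a minimal sketch of a stricter setup could look like the following. The helper name build_connect_args and the path /path/to/ca.pem are hypothetical, made up for illustration. It relies on two requests behaviours: a (user, password) tuple is accepted as HTTP Basic auth, and verify may be True or a path to a CA bundle file.

```python
def build_connect_args(username, password, ca_bundle=None):
    """Build a connect_args dict for create_engine (hypothetical helper)."""
    # requests treats a (user, password) tuple as HTTP Basic auth,
    # the same as HTTPBasicAuth(user, password).
    # verify=True checks the server certificate against the CA bundle
    # requests trusts by default; a string is read as a CA bundle path.
    verify = ca_bundle if ca_bundle is not None else True
    return {
        "protocol": "https",
        "requests_kwargs": {"auth": (username, password), "verify": verify},
    }


# "/path/to/ca.pem" is a placeholder, not a real file.
connect_args = build_connect_args("bd-user", "xxxxxx", ca_bundle="/path/to/ca.pem")
print(connect_args["requests_kwargs"]["verify"])
```

The resulting dict would be passed as create_engine('presto://bd-user@presto.bi.cn/hive/dwd', connect_args=connect_args). I have not tested this against a real cluster, so treat it as a starting point.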
The PyHive approach

    cursor = presto.connect(
        host="presto.bi.cn",
        port=443,
        schema="dwd",
        catalog='hive',
        protocol="https",
        username=database_username,
        password=database_password,
        requests_kwargs={'verify': False},
    ).cursor()
    cursor.execute(
        "select syscode,diffday from dwd.demandsdeliverycycle_result "
        "WHERE diffday >= -30 and created_time >= '2020-01-01'"
    )
    data = cursor.fetchall()
    # Each entry of cursor.description describes one column; its first item is the name
    columnDes = cursor.description
    columnNames = [col[0] for col in columnDes]
    df_train = pd.DataFrame(data, columns=columnNames)
    print(df_train.head(10))

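The column-name step in the PyHive approach works with any DB-API cursor, so it can be tried without a Presto server. Here is a small sketch that uses the standard library's sqlite3 as a stand-in. The helper fetch_with_columns and the demo table are hypothetical, made up for this illustration.

```python
import sqlite3


def fetch_with_columns(cursor, sql):
    """Run sql and return (column_names, rows) from a DB-API cursor."""
    cursor.execute(sql)
    rows = cursor.fetchall()
    # The first item of each description entry is the column name.
    columns = [col[0] for col in cursor.description]
    return columns, rows


# In-memory SQLite database standing in for Presto.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE demo (syscode TEXT, diffday INTEGER)")
cur.executemany("INSERT INTO demo VALUES (?, ?)", [("A", -3), ("B", 12)])

columns, rows = fetch_with_columns(
    cur, "SELECT syscode, diffday FROM demo ORDER BY syscode"
)
print(columns)  # ['syscode', 'diffday']
print(rows)     # [('A', -3), ('B', 12)]
```

With a PyHive cursor the same two values can go straight into pd.DataFrame(rows, columns=columns).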
Reference: https://github.com/dropbox/PyHive#sqlalchemy