How to group data by time intervals in Python Pandas?

Original: GeeksforGeeks (www.geeksforgeeks.org)

Prerequisite: pandas

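Grouping data by time intervals comes up naturally in time series analysis. A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive, equally spaced points in time.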
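As a small illustration of the definition above, the sketch below builds an equally spaced time series. The dates and readings are made up for the example.

```python
import pandas as pd

# Six timestamps spaced exactly one day apart
index = pd.date_range("2021-01-01", periods=6, freq="D")

# Hypothetical daily readings, indexed in time order
series = pd.Series([10, 12, 9, 14, 11, 13], index=index)

# The gap between consecutive timestamps is the same everywhere
gaps = index[1:] - index[:-1]
print(series)
print(gaps.unique())
```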

pandas provides two very useful tools for grouping data by time.

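resample(): This function is used primarily with time series data. It is a convenience method for frequency conversion and resampling of time series. The object must have a datetime-like index (DatetimeIndex, PeriodIndex, or TimedeltaIndex), or datetime-like values must be passed through the on or level keyword. resample() splits the rows into time bins of the target frequency, and an aggregation such as sum() or mean() is then applied to each bin.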

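Syntax: DataFrame.resample(rule, how=None, axis=0, fill_method=None, closed=None, label=None, convention='start', kind=None, loffset=None, limit=None, base=0, on=None, level=None)

This is the signature from older pandas releases. Current releases no longer accept how, fill_method, limit, loffset, or base; loffset and base were deprecated in pandas 1.1 and removed in pandas 2.0 in favour of origin and offset.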

Parameters:

rule: the offset string or object representing the target conversion.

axis: int, optional, default 0.

closed: {'right', 'left'}. Which side of each bin interval is closed.

label: {'right', 'left'}. Which bin edge is used to label the bin.

loffset: adjusts the resampled time labels (older pandas releases only).

base: for frequencies that evenly subdivide one day, the "origin" of the aggregated intervals. For example, for a '5min' frequency, base can range from 0 through 4. The default is 0 (older pandas releases only).

on: for a DataFrame, the column to use for resampling instead of the index. The column must be datetime-like.

level: for a MultiIndex, the level (name or number) to use for resampling. The level must be datetime-like.
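To make the closed, label, and on parameters more concrete, here is a minimal sketch on made-up data (one reading per minute, with hypothetical column names created_at and price). It compares the default left-closed, left-labelled 5-minute bins with right-closed, right-labelled ones.

```python
import pandas as pd

# Hypothetical data: one reading per minute for ten minutes
df = pd.DataFrame({
    "created_at": pd.date_range("2021-01-01 00:00", periods=10, freq="min"),
    "price": range(1, 11),
})

# `on` resamples on a column instead of the index.
# For '5min' the default bins are closed and labelled on the left edge.
left = df.resample("5min", on="created_at")["price"].sum()

# Same bin width, but closed and labelled on the right edge
right = df.resample(
    "5min", on="created_at", closed="right", label="right"
)["price"].sum()

print(left)
print(right)
```

Because the first timestamp sits exactly on a bin edge, the right-closed version places it in a bin of its own, so the two results have a different number of rows while their totals agree.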

Example: the quantity added each month, or the total amount added each year.
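The example just mentioned could be sketched as follows. The records are invented for illustration, and the 'MS' (month start) and 'YS' (year start) aliases are one possible choice of monthly and yearly frequencies.

```python
import pandas as pd

# Hypothetical sales records spanning two years
df = pd.DataFrame(
    {"quantity": [1, 2, 3, 4], "price": [10.0, 20.0, 30.0, 40.0]},
    index=pd.to_datetime(
        ["2020-01-05", "2020-01-20", "2020-02-10", "2021-03-01"]
    ),
)

# Quantity added in each month; 'MS' labels each bin by the month start
monthly_quantity = df["quantity"].resample("MS").sum()

# Total amount added in each year; 'YS' labels each bin by the year start
yearly_amount = df["price"].resample("YS").sum()

print(monthly_quantity)
print(yearly_amount)
```

Months without any records still appear in the monthly result; with sum() they show up as 0.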

Grouper: pd.Grouper lets the user specify the basis on which the data is grouped for analysis.

Syntax: dataframe.groupby(pd.Grouper(key, level, freq, axis, sort, label, convention, base, loffset, origin, offset))

Parameters:

key: selects the target column to group by.

level: the level of the target index.

freq: groups by the specified frequency when the target (selected by key or level) is a datetime-like object.

axis: name or number of the axis.

sort: whether to sort the resulting labels.

label: the interval boundary used for labelling; valid only when freq is passed.

convention: applies when the grouper is a PeriodIndex and freq is passed.

base: works only when freq is passed.

loffset: works only when freq is passed.

origin: the timestamp on which to adjust the grouping.

offset: an offset timedelta added to the origin.
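The key and freq parameters can be illustrated with a short sketch. The data and the store column are hypothetical; the point is that a Grouper can be used on its own or combined with ordinary column keys.

```python
import pandas as pd

# Hypothetical orders with a timestamp column and a category column
df = pd.DataFrame({
    "created_at": pd.to_datetime(
        ["2021-01-04", "2021-01-05", "2021-01-12", "2021-01-13"]
    ),
    "store": ["A", "B", "A", "A"],
    "price": [10, 20, 30, 40],
})

# key selects the datetime column, freq sets the bin width (weekly here)
weekly = df.groupby(pd.Grouper(key="created_at", freq="W"))["price"].sum()

# A Grouper can be mixed with regular column keys in the same groupby
per_store = df.groupby(
    ["store", pd.Grouper(key="created_at", freq="W")]
)["price"].sum()

print(weekly)
print(per_store)
```

The 'W' frequency produces weeks ending on Sunday, so each group is labelled with the Sunday that closes the week.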

Approach

Import the module
Load or create the data
Resample the data as required
Group the data
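The four steps above can be sketched end to end as follows. A small in-memory CSV with made-up rows stands in for a real file, and the column names created_at, price, and quantity are assumed to match the dataset used later in the article.

```python
# Step 1: import the modules
import io
import pandas as pd

# Step 2: load the data (an in-memory CSV stands in for a real file)
csv_text = """created_at,price,quantity
2021-01-04 09:15:00,10,1
2021-01-06 17:40:00,20,2
2021-01-13 08:05:00,35,4
"""
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["created_at"])
data = data.set_index("created_at")

# Step 3: resample to the required frequency (weekly here)
weekly = data.resample("W")

# Step 4: aggregate each weekly group
totals = weekly.agg({"price": "sum", "quantity": "sum"})
print(totals)
```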

An implementation of this approach is given below:

DataFrame in use: timeseries.csv


Program: aggregation using resample

Python 3

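import pandas as pd

# load the dataset (replace the placeholder with the path to the CSV file)
data = pd.read_csv('path of dataset')

# use the 'created_at' column as the index
data = data.set_index('created_at')

# convert the index to a DatetimeIndex, which resample() requires
data.index = pd.to_datetime(data.index)

# weekly total of price; by default each weekly label falls at midnight.
# Older pandas shifted the labels with loffset='30Min30s'; that argument was
# removed in pandas 2.0, so the shift is added to the result's index instead
shift = pd.Timedelta(minutes=30, seconds=30)
weekly_price = data.resample('W').price.sum()
weekly_price.index = weekly_price.index + shift
print(weekly_price.head(2))

# several columns can be aggregated at once: this shows the quantity added
# in each week as well as the total amount added in each week
weekly = data.resample('W').agg({'price': 'sum', 'quantity': 'sum'})
weekly.index = weekly.index + shift
print(weekly.head(5))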
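For comparison, the same kind of weekly aggregation can also be written with groupby and pd.Grouper. The sketch below uses a few made-up rows rather than timeseries.csv, keeps created_at as a regular column, and checks that both routes give the same table.

```python
import pandas as pd

# Hypothetical rows; 'created_at' stays a regular column here
df = pd.DataFrame({
    "created_at": pd.to_datetime(["2021-01-04", "2021-01-06", "2021-01-13"]),
    "price": [10, 20, 35],
    "quantity": [1, 2, 4],
})
agg_spec = {"price": "sum", "quantity": "sum"}

# Route 1: resample on the datetime column
via_resample = df.resample("W", on="created_at").agg(agg_spec)

# Route 2: groupby with a Grouper on the same column and frequency
via_grouper = df.groupby(pd.Grouper(key="created_at", freq="W")).agg(agg_spec)

print(via_grouper)
print(via_resample.equals(via_grouper))
```

Grouper tends to be the more convenient choice when the time bins need to be combined with other grouping keys; resample is shorter when time is the only grouping dimension.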

Copyright: 月萌API (www.moonapi.com). Please credit the source when reposting.

Article link: https://moonapi.com/news/14750.html
