Elasticsearch.Nest Tutorial Series 8, Aggregations: Writing Aggregations

Created: 2020-01-22 05:18:01 | Last updated: 2020-01-23 04:15:44

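This series is a "pseudo" translation of the official documentation, adapted for local readers. It does not translate the official docs word for word; it is a "near" translation written after reading and testing the original docs and working them into my own understanding. If you see things differently or find an explanation lacking, please bear with me and feel free to point it out :)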

Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/introduction.html

Version environment for this series: ElasticSearch@7.3.1 and NEST@7.3.1; the IDE and development platform default to VS2019 and .NET Core 2.1.

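Aggregations in ES can loosely be compared to the aggregate functions in SQL Server (SUM, COUNT, and so on). Aggregations can be nested, and with them you can find the maximum, minimum, or average of a field, sum a field, and build more complex summaries of your data.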

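ES also introduces the concept of buckets, which you can roughly think of as the equivalent of grouping (GROUP BY) in SQL Server. In other words, what SQL calls GROUP BY is called "bucketing" in ES.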

For a fuller description of aggregations in Elasticsearch itself, see the Elasticsearch reference documentation.
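To make the "nesting" and "bucketing" ideas above a little more concrete, here is a minimal sketch that groups documents by name and computes an average inside each group, roughly a GROUP BY with AVG. It rests on several assumptions that are not in the original text: a cluster reachable at http://localhost:9200, an index called project, and a default dynamic mapping that gives the name field a keyword sub-field. The aggregation name by_name is made up for illustration.

```csharp
using System;
using Nest;

public class Project
{
    public string Name { get; set; }
    public int Quantity { get; set; }
}

public static class BucketSketch
{
    public static void Main()
    {
        // Assumption: a local single-node cluster and an index named "project".
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("project");
        var client = new ElasticClient(settings);

        var response = client.Search<Project>(s => s
            .Size(0) // only the aggregation results are needed
            .Aggregations(aggs => aggs
                // Bucket (group) documents by name; assumes a "name.keyword" sub-field exists.
                .Terms("by_name", t => t
                    .Field(p => p.Name.Suffix("keyword"))
                    // Nested aggregation: average quantity within each bucket.
                    .Aggregations(sub => sub
                        .Average("average_quantity", avg => avg.Field(p => p.Quantity))
                    )
                )
            )
        );

        if (!response.IsValid)
        {
            // Print NEST's diagnostic text instead of reading a missing aggregation.
            Console.WriteLine(response.DebugInformation);
            return;
        }

        foreach (var bucket in response.Aggregations.Terms("by_name").Buckets)
        {
            var average = bucket.Average("average_quantity")?.Value;
            Console.WriteLine($"{bucket.Key}: count={bucket.DocCount}, avg={average}");
        }
    }
}
```

If the name field is mapped differently in your index, the Suffix("keyword") part would need to change accordingly.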

Writing aggregations

NEST offers three ways to write aggregations:

Using lambda expressions.
Using the built-in request object AggregationDictionary.
Using binary operators to simplify working with AggregationDictionary.

Suppose we have the following Project class:

public class Project
{
    public string Name { get; set; }
    public int Quantity { get; set; }
}

All three approaches send the same request, shown below:

POST /project/_search?typed_keys=true
{
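	"aggs": { // the keyword "aggregations" can be abbreviated to "aggs"
		"average_quantity": { // name of the aggregation
			"avg": {  // type of the aggregation, comparable to an aggregate function in SQL Server
				"field": "quantity"  // body of the aggregation: the field to aggregate on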
			}
		},
		"max_quantity": {
			"max": {
				"field": "quantity"
			}
		},
		"min_quantity": {
			"min": {
				"field": "quantity"
			}
		}
	}
}

The lambda approach

Lambda expressions are a concise way to write aggregations:

var searchResponse = _client.Search<Project>(s => s
    .Aggregations(aggs => aggs
        .Average("average_quantity", avg => avg.Field(p => p.Quantity))
        .Max("max_quantity", max => max.Field(p => p.Quantity))
        .Min("min_quantity", min => min.Field(p => p.Quantity))
    )
);

The response looks like this:

{
	"took": 2,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 6,
			"relation": "eq"
		},
		"max_score": 1.0,
		"hits": [
			{
				"_index": "project",
				"_type": "_doc",
				"_id": "1",
				"_score": 1.0,
				"_source": {
					"name": "Emma",
					"quantity": 1
				}
			},
			{
				"_index": "project",
				"_type": "_doc",
				"_id": "2",
				"_score": 1.0,
				"_source": {
					"name": "Tran",
					"quantity": 2
				}
			},
			{
				"_index": "project",
				"_type": "_doc",
				"_id": "3",
				"_score": 1.0,
				"_source": {
					"name": "Lucy",
					"quantity": 3
				}
			},
			{
				"_index": "project",
				"_type": "_doc",
				"_id": "4",
				"_score": 1.0,
				"_source": {
					"name": "Geo",
					"quantity": 4
				}
			},
			{
				"_index": "project",
				"_type": "_doc",
				"_id": "5",
				"_score": 1.0,
				"_source": {
					"name": "Luby",
					"quantity": 5
				}
			},
			{
				"_index": "project",
				"_type": "_doc",
				"_id": "6",
				"_score": 1.0,
				"_source": {
					"name": "Han",
					"quantity": 6
				}
			}
		]
	},
	"aggregations": {
		"avg#average_quantity": {
			"value": 3.5
		},
		"max#max_quantity": {
			"value": 6.0
		},
		"min#min_quantity": {
			"value": 1.0
		}
	}
}

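An aggregation query usually does not need the _source documents, so you can set size=0 on the query. The response then contains the aggregation results with an empty hits array: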

var searchResponse = _client.Search<Project>(s => s
    .Size(0)  // explicitly set to 0 so no documents are returned
    .Aggregations(aggs => aggs
        .Average("average_quantity", avg => avg.Field(p => p.Quantity))
        .Max("max_quantity", max => max.Field(p => p.Quantity))
        .Min("min_quantity", min => min.Field(p => p.Quantity))
    )
);

With that change, the response becomes:

{
	"took": 3,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 6,
			"relation": "eq"
		},
		"max_score": null,
		"hits": []
	},
	"aggregations": {
		"avg#average_quantity": {
			"value": 3.5
		},
		"max#max_quantity": {
			"value": 6.0
		},
		"min#min_quantity": {
			"value": 1.0
		}
	}
}

Using the built-in AggregationDictionary object

The following code has the same effect as the lambda version:

var searchRequest = new SearchRequest<Project>
{
    Size = 0,
    Aggregations = new AggregationDictionary
    {
        {"average_quantity", new AverageAggregation("average_quantity", "quantity")},
        {"max_quantity", new MaxAggregation("max_quantity", "quantity")},
        {"min_quantity", new MinAggregation("min_quantity", "quantity")},
    }
};
var searchResponse = _client.Search<Project>(searchRequest);

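This approach is less readable: each aggregation name has to be written twice, once as the dictionary key and once as the constructor argument.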

Simplifying AggregationDictionary with binary operators

Binary operators make the code more readable. The following is equivalent to the code above:

var searchRequest = new SearchRequest<Project>
{
    Size = 0,
    Aggregations = new AverageAggregation("average_quantity", "quantity")
        && new MaxAggregation("max_quantity", "quantity")
        && new MinAggregation("min_quantity", "quantity")
};
var searchResponse = _client.Search<Project>(searchRequest);

Getting the response results

The aggregation results are available through the .Aggregations property of the response model.
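As a minimal sketch of reading those results, the code below looks up each aggregation by the name used in the request and reads its Value. So that it runs without a cluster, it feeds NEST a canned copy of the size=0 response shown earlier through Elasticsearch.Net's InMemoryConnection. That is my own choice for illustration, not something the original post does, and the names AggregationResultSketch, CannedResponse, and Run are hypothetical. Against a real cluster you would read searchResponse.Aggregations in exactly the same way.

```csharp
using System;
using System.Text;
using Elasticsearch.Net;
using Nest;

public class Project
{
    public string Name { get; set; }
    public int Quantity { get; set; }
}

public static class AggregationResultSketch
{
    // Canned body copied from the size=0 response shown earlier (assumption: used
    // here only so the sketch needs no running cluster).
    private const string CannedResponse = @"{
        ""took"": 3, ""timed_out"": false,
        ""_shards"": { ""total"": 1, ""successful"": 1, ""skipped"": 0, ""failed"": 0 },
        ""hits"": { ""total"": { ""value"": 6, ""relation"": ""eq"" }, ""max_score"": null, ""hits"": [] },
        ""aggregations"": {
            ""avg#average_quantity"": { ""value"": 3.5 },
            ""max#max_quantity"": { ""value"": 6.0 },
            ""min#min_quantity"": { ""value"": 1.0 }
        }
    }";

    public static (double? Average, double? Max, double? Min) Run()
    {
        // InMemoryConnection returns the canned bytes for every request.
        var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
        var connection = new InMemoryConnection(Encoding.UTF8.GetBytes(CannedResponse));
        var settings = new ConnectionSettings(pool, connection).DefaultIndex("project");
        var client = new ElasticClient(settings);

        var searchResponse = client.Search<Project>(s => s
            .Size(0)
            .Aggregations(aggs => aggs
                .Average("average_quantity", avg => avg.Field(p => p.Quantity))
                .Max("max_quantity", max => max.Field(p => p.Quantity))
                .Min("min_quantity", min => min.Field(p => p.Quantity))
            )
        );

        // Look up each aggregation by its request name; the "avg#" style prefix
        // in the raw JSON is not part of the name used here.
        var aggregations = searchResponse.Aggregations;
        return (
            aggregations.Average("average_quantity")?.Value,
            aggregations.Max("max_quantity")?.Value,
            aggregations.Min("min_quantity")?.Value
        );
    }

    public static void Main()
    {
        var stats = Run();
        Console.WriteLine($"avg={stats.Average}, max={stats.Max}, min={stats.Min}");
    }
}
```

Value is a nullable double, so it is worth checking for null before using it, for example when no documents matched the query.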

Reserved keywords

When naming aggregations, avoid names that clash with ES reserved keywords, such as the following (the list is not exhaustive):

“score”
“value_as_string”
“keys”
“max_score”
