Author: jqpeng
Original post: Bistoury, a Java application diagnostics and online debugging tool: introduction and usage in a K8S environment


Introduction to Bistoury

Bistoury is an open-source Java application diagnostics tool from Qunar. It is transparent and non-intrusive to the application: without logging into the machine or modifying the system, developers can diagnose an application from every angle (logs, memory, threads, class information, debugging, machine and system properties), which greatly improves their efficiency and ability to troubleshoot problems.

Bistoury integrates Alibaba's open-source arthas and VIP.com's open-source vjtools, so all arthas and vjtools features are available inside Bistoury.
While arthas and vjtools are driven from the command line (or something close to it), Bistoury keeps the command-line interface and additionally provides a graphical UI for many commands, which makes them easier to use.

The English word "bistoury" means a surgical knife, so the intent behind the name speaks for itself.

Screenshots

View logs through the command-line interface and use all the arthas and vjtools features
console

Online debugging, a great tool for debugging applications in place
debug

Thread-level CPU monitoring, which helps you understand per-thread CPU usage
jstack_dump

View JVM runtime information, along with various other details, in the web UI
jvm

Dynamically add monitoring to a method
monitor

Thread dump
thread_dump

Bistoury Architecture

The core Bistoury components are the agent, proxy, and ui:

  • agent: deployed alongside the application being diagnosed; it executes the actual diagnostic commands and connects to the proxy via a domain name
  • proxy: the agent's proxy; when an agent starts it connects to a proxy over WebSocket and registers itself. Multiple proxies can be deployed, ideally behind a domain-name-based load balancer
  • ui: provides the graphical and command-line interfaces, receives commands from the user, forwards them to the proxy, and displays the results returned from the proxy

The data flow for a single command execution is ui -> proxy -> agent -> proxy -> ui.

In more detail:

  • The proxy starts first and registers its own address in ZooKeeper
  • The agent reaches a proxy via the domain name, is assigned to a random proxy, and registers itself with that proxy
  • When the UI accesses a specific application, it fetches all proxies from ZooKeeper and checks each one to see whether the app's agent is registered there; if so, the web page connects to that proxy
  • A command entered on the web page flows web -> proxy -> agent -> proxy -> ui

For details see https://github.com/qunarcorp/bistoury/blob/master/docs/cn/design/design.md

An analysis of Bistoury's internals: https://www.jianshu.com/p/f7202e490156

In short, it uses agent technology similar to SkyWalking's to monitor and assist programs running on the JVM.

Bistoury Quick Start

The official quick start guide: https://github.com/qunarcorp/bistoury/blob/master/docs/cn/quick_start.md

You can download the release package and start it up to try Bistoury out.

First, copy the quick-start package bistoury-quick-start.tar.gz to the installation directory.

Then unpack it:

tar -zxvf bistoury-quick-start.tar.gz
cd bistoury

Finally, start Bistoury. Because Bistoury performs operations such as jstack, it must be started as the same user as the Java application being diagnosed for all features to work.

Assume the application's process id is 1024:

  • If the application was started under your own user, simply run
./quick_start.sh -p 1024 start
  • If the application runs under another account, e.g. tomcat, specify the user and run
sudo -u tomcat ./quick_start.sh -p 1024 start
  • To stop it
./quick_start.sh stop

Running Bistoury in Docker

The official git repository has a docker branch, and the relevant documentation can be found by browsing it.

The official quick start commands:

#!/bin/bash
# create the network
echo "start create network"
docker network create --subnet=172.19.0.0/16 bistoury
# mysql image
echo "start run mysql image"
docker run --name mysql -p 3307:3306 -e MYSQL_ROOT_PASSWORD=root -d -i --net bistoury --ip 172.19.0.7  registry.cn-hangzhou.aliyuncs.com/bistoury/bistoury-db
# zookeeper image
echo "start run zk image"
docker run -d -p 2181:2181 -it --net bistoury --ip 172.19.0.2 registry.cn-hangzhou.aliyuncs.com/bistoury/zk:latest
sleep 30
# proxy image
echo "start run proxy module"
docker run -d -p 9880:9880 -p 9881:9881 -p 9090:9090 -i --net bistoury --ip 172.19.0.3 registry.cn-hangzhou.aliyuncs.com/bistoury/bistoury-proxy --real-ip $1 --zk-address 172.19.0.2:2181 --proxy-jdbc-url jdbc:mysql://172.19.0.7:3306/bistoury
# ui image
echo "start run ui module"
docker run -p 9091:9091  -it -d --net bistoury --ip 172.19.0.4 registry.cn-hangzhou.aliyuncs.com/bistoury/bistoury-ui --zk-address 172.19.0.2:2181 --ui-jdbc-url jdbc:mysql://172.19.0.7:3306/bistoury
# demo application images
echo "start run demo application"
docker  run -it -d  -p 8686:8686 -i --net bistoury --ip 172.19.0.5 registry.cn-hangzhou.aliyuncs.com/bistoury/bistoury-demo --proxy-host $1:9090
docker  run -it -d  -p 8687:8686 -i --net bistoury --ip 172.19.0.6 registry.cn-hangzhou.aliyuncs.com/bistoury/bistoury-demo --proxy-host $1:9090

The commands above cannot be run as-is: replace $1 with the current server's IP and then run them.

Running Bistoury in Production

The officially recommended deployment:

  • ui deployed independently, preferably on multiple machines, behind its own domain name
  • proxy deployed independently, preferably on multiple machines, behind its own domain name
  • agent deployed on the same machine as the application; automatic deployment across the whole test environment is recommended, while production should offer one-click deployment for a single machine as well as for all machines of an application
  • an independent application center that manages all internal applications and machine information; this is a system separate from Bistoury, from which Bistoury pulls continuously updated application and machine information

The key point here is the application center. Bistoury ships with a simple built-in one (the bistoury-application module in the code base), and both the ui and the proxy obtain application information through this project. The default implementation provided is MySQL-based:

application

The drawback of the MySQL implementation is that you have to maintain applications and their servers by hand in the UI, which is fine for a demo but unacceptable in production. A more elegant approach is for applications to register themselves with a registry on startup and report their application name, machine information (IP, domain name, etc.), port, and so on. A registry is standard for most microservice architectures anyway, so all that is needed is an implementation of the bistoury-application-api interface.

bistoury-application-k8s (Bistoury on K8S)

All of our team's applications run on K8S, so we need a bistoury-application-k8s implementation.

Copy the bistoury-application-mysql project and create bistoury-application-k8s.

The mapping is straightforward:

  • one application corresponds to one deployment, which maps to one Application
  • one deployment contains n pods, which map to ApplicationServer entries

So all we need to do is call the K8S API to fetch the deployments and pods.

First, add the required dependency:

<dependency>
    <groupId>io.kubernetes</groupId>
    <artifactId>client-java</artifactId>
    <version>8.0.0</version>
    <scope>compile</scope>
</dependency>

Initialize the ApiClient:

ApiClient defaultClient = Configuration.getDefaultApiClient();
defaultClient.setBasePath(k8sApiServer);
ApiKeyAuth bearerToken = (ApiKeyAuth) defaultClient.getAuthentication("BearerToken");
bearerToken.setApiKey(k8sToken);
bearerToken.setApiKeyPrefix("Bearer");
defaultClient.setVerifyingSsl(false);

Fetching deployments

Depending on configuration, either fetch deployments from all namespaces or only from the allowed ones:

   private List<V1Deployment> getDeployments() throws ApiException {
        AppsV1Api appsV1Api = new AppsV1Api(k8SConfiguration.getApiClient());
        return k8SConfiguration.isAllNamespace()
                ? appsV1Api.listDeploymentForAllNamespaces(false, null, null, null, 0, null, null, 120, false).getItems()
                : getNamespacesDeployments(k8SConfiguration.getAllowedNamespace());
    }

    List<V1Deployment> getNamespacesDeployments(List<String> namespaces) {
        AppsV1Api appsV1Api = new AppsV1Api(k8SConfiguration.getApiClient());
        List<V1Deployment> deploymentList = new ArrayList<>();
        for (String nameSpace : namespaces) {
            try {
                deploymentList.addAll(appsV1Api.listNamespacedDeployment(nameSpace, null, null, null, null, null, 0, null, 120, false).getItems());
            } catch (ApiException e) {
                logger.error("get " + nameSpace + "'s deployment error", e);
            }
        }
        return deploymentList;
    }

Convert them to Application objects:

    private List<Application> getApplications(List<V1Deployment> applist) {
        return applist.stream().map(this::getApplication).collect(Collectors.toList());
    }

    private Application getApplication(V1Deployment deployment) {
        Application application = new Application();
        application.setCreateTime(deployment.getMetadata().getCreationTimestamp().toDate());
        application.setCreator(deployment.getMetadata().getName());
        application.setGroupCode(deployment.getMetadata().getNamespace());
        application.setName(deployment.getMetadata().getName());
        application.setStatus(1);
        application.setCode(getAppCode(deployment.getMetadata().getNamespace(), deployment.getMetadata().getName()));
        return application;
    }

Fetching pods

Fetching pods is slightly more involved: first get the V1Deployment, take its label selector, and then select the pods using that selector:

 public List<AppServer> getAppServerByAppCode(final String appCode) {
        Preconditions.checkArgument(!Strings.isNullOrEmpty(appCode), "app code cannot be null or empty");

        try {
            V1Deployment deployment = getDeployMent(appCode);
            String nameSpace = appCode.split(APPCODE_SPLITTER)[0];
            Map<String, String> labelMap = Objects.requireNonNull(deployment.getSpec()).getSelector().getMatchLabels();
            StringBuilder lableSelector = new StringBuilder();
            labelMap.entrySet().stream().forEach(e -> {
                if (lableSelector.length() > 0) {
                    lableSelector.append(",");
                }
                lableSelector.append(e.getKey()).append("=").append(e.getValue());
            });

            CoreV1Api coreV1Api = new CoreV1Api(k8SConfiguration.getApiClient());
            V1PodList podList = coreV1Api.listNamespacedPod(nameSpace, null, false, null,
                    null, lableSelector.toString(), 200, null, 600, false);

            return podList.getItems().stream().map(pod -> {
                AppServer server = new AppServer();
                server.setAppCode(appCode);
                server.setHost(pod.getMetadata().getName());
                server.setIp(pod.getStatus().getPodIP());
                server.setLogDir(k8SConfiguration.getAppLogPath());
                server.setAutoJMapHistoEnable(true);
                server.setAutoJStackEnable(true);
                server.setPort(8080);
                return server;
            }).collect(Collectors.toList());

        } catch (ApiException e) {
            logger.error("get deployment's pod  error", e);
        }

        return null;

    }
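The snippet above also relies on two helpers, getAppCode and getDeployMent, that are not shown in this post. As a rough sketch of what they might look like (the "namespace-deploymentName" appCode format and the exact client-java call signature are my assumptions, not taken from the original code):

    private static final String APPCODE_SPLITTER = "-";   // assumed separator between namespace and deployment name

    private String getAppCode(String namespace, String deploymentName) {
        return namespace + APPCODE_SPLITTER + deploymentName;
    }

    private V1Deployment getDeployMent(String appCode) throws ApiException {
        // appCode is assumed to be "<namespace><separator><deployment name>"; limit the split to 2
        // so deployment names containing the separator are preserved
        String[] parts = appCode.split(APPCODE_SPLITTER, 2);
        AppsV1Api appsV1Api = new AppsV1Api(k8SConfiguration.getApiClient());
        return appsV1Api.readNamespacedDeployment(parts[1], parts[0], null, null, null);
    }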

Finally, modify the ui and proxy projects to replace the original mysql module with the k8s one:

Modify the pom

Adding the Bistoury agent to an application

This part is relatively easy.

In the Dockerfile of the application you want to debug, add:

COPY  --from=hub.xfyun.cn/abkdev/bistoury-agent:2.0.11  /home/q/bistoury  /opt/bistoury

Then modify the application's startup script. At the very beginning, add:

BISTOURY_APP_LIB_CLASS="org.springframework.web.servlet.DispatcherServlet"

# default proxy
PROXY="bistoury-bistoury-proxy.incubation:9090"
AGENT_JAVA_HOME="/usr/local/openjdk-8/"

# env
if [[ -n $PROXY_HOST ]]; then
    PROXY=$PROXY_HOST
fi

TEMP=`getopt -o : --long proxy-host:,app-class:,agent-java-home: -- "$@"`

eval set -- "$TEMP"

while true; do
  case "$1" in
    --proxy-host )
      PROXY="$2"; shift 2 ;;
    --app-class )
      BISTOURY_APP_LIB_CLASS="$2"; shift 2 ;;
    --agent-java-home )
      AGENT_JAVA_HOME="$2"; shift 2 ;;
    * ) break ;;
  esac
done


echo "proxy host: "$PROXY
echo "app class: "$BISTOURY_APP_LIB_CLASS
echo "agent java home: "$AGENT_JAVA_HOME

At the very end of the script, add:

APP_PID=`$AGENT_JAVA_HOME/bin/jps -l|awk '{if($2!="sun.tools.jps.Jps"){print $1 ;{exit}} }'`

echo "app pid: "$APP_PID

/opt/bistoury/agent/bin/bistoury-agent.sh -j $AGENT_JAVA_HOME -p $APP_PID -c $BISTOURY_APP_LIB_CLASS -s $PROXY -f start

Integration test

Deploy a test application, agent-debug-demo, into the jx namespace:

 agent-debug-demo

{
  "kind": "Deployment",
  "apiVersion": "extensions/v1beta1",
  "metadata": {
    "name": "agent-debug-demo",
    "namespace": "jx",
    "annotations": {
      "deployment.kubernetes.io/revision": "2"
    }
  },
  "spec": {
    "replicas": 1,
    "selector": {
      "matchLabels": {
        "app": "agent-debug-demo",
        "draft": "draft-app"
      }
    },
    "template": {
      "metadata": {
        "creationTimestamp": null,
        "labels": {
          "app": "agent-debug-demo",
          "draft": "draft-app"
        }
      },
      "spec": {
        "containers": [
          {
            "name": "springboot-rest-demo",
            "image": "hub.xxx.cn/abkdev/springboot-rest-demo:dev-113",
            "ports": [
              {
                "containerPort": 8080,
                "protocol": "TCP"
              }
            ],
            "env": [
              {
                "name": "SPRING_PROFILES_ACTIVE",
                "value": "dev"
              },
              {
                "name": "PROXY_HOST",
                "value": "$PROXY_HOST:9090"
              }
            ],
            "resources": {},
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "imagePullPolicy": "IfNotPresent"
          }
        ],
        "restartPolicy": "Always",
        "terminationGracePeriodSeconds": 10,
        "dnsPolicy": "ClusterFirst",
        "securityContext": {},
        "schedulerName": "default-scheduler"
      }
    },
    "strategy": {
      "type": "RollingUpdate",
      "rollingUpdate": {
        "maxUnavailable": 1,
        "maxSurge": 1
      }
    },
    "revisionHistoryLimit": 2147483647,
    "progressDeadlineSeconds": 2147483647
  },
  "status": {
    "observedGeneration": 2,
    "replicas": 1,
    "updatedReplicas": 1,
    "unavailableReplicas": 1,
    "conditions": [
      {
        "type": "Available",
        "status": "True",
        "lastUpdateTime": "2020-04-09T01:32:42Z",
        "lastTransitionTime": "2020-04-09T01:32:42Z",
        "reason": "MinimumReplicasAvailable",
        "message": "Deployment has minimum availability."
      }
    ]
  }
}

After deploying:

K8S Dashboard

Open the UI and take a look:

The application name is displayed as: namespace name-deployment name

bistoury

ThreadDump

Online debugging:

First select the application:

Select application

Click Debug and then choose the class you want to debug.

The test project's source code is:

@SpringBootApplication
@Controller
public class RestPrometheusApplication {

    @Autowired
    private MeterRegistry registry;

    @Autowired
    private Environment env;

    @GetMapping(path = "/", produces = "application/json")
    @ResponseBody
    public Map<String, Object> landingPage() {
        Counter.builder("mymetric").tag("foo", "bar").register(registry).increment();
        String profile = "default";
        if (env.getActiveProfiles().length > 0) {
            profile = env.getActiveProfiles()[0];
        }
        return singletonMap("hello", "" + profile);
    }

    public static void main(String[] args) {
        SpringApplication.run(RestPrometheusApplication.class, args);
    }
}

So we filter on RestPrometheusApplication:

Select the class

Then click debug; as you can see, the decompiled source code is shown:

Online debug 1

Add a breakpoint on the last line of landingPage, click "add breakpoint", and then call the service behind this pod. The pod's IP is 170.22.149.37, so we request:

curl http://170.22.149.37:8080
{"hello":"dev"}

Back in the UI you can now see the member variables, local variables, call stack and other information.

Online debug 2

Well done!



Author: jqpeng
Original post: Getting free-ss accounts and importing them into SSR

Prerequisites

  • If you don't know what SSR is, skip this post
  • Download SSR
  • Add a hosts entry on your computer: 104.18.36.36 free-ss.site

Getting accounts

  • Open free-ss.site and wait for the accounts to appear
  • Press F12 to open the developer tools and paste the script below into the console; it parses the accounts automatically and then prompts you to save them to a file
function download(filename, text) {
  var element = document.createElement('a');
  element.setAttribute('href', 'data:text/plain;charset=utf-8,' + encodeURIComponent(text));
  element.setAttribute('download', filename);
 
  element.style.display = 'none';
  document.body.appendChild(element);
 
  element.click();
 
  document.body.removeChild(element);
}

var trList = $("#tbss tr")
ssList = []
for (let i = 1; i < trList.length; i++) {
      const tdList = trList.eq(i).find('td');
      ssList.push({
        server: tdList.eq(1).text(),
        server_port: tdList.eq(2).text(),
        password: tdList.eq(4).text(),
        method: tdList.eq(3).text() || 'aes-256-cfb',
        group:"crawl",
        enable: true,
        remarks: tdList.eq(6).text() + ' ' + (Math.ceil(Math.random() * 10000)),
        timeout: 5
      });
    }
download("ssr-list.txt",JSON.stringify({"configs" : ssList}))

Importing into SSR

  • Open the SSR client, choose Servers - Import servers from file, and select the file generated above
  • Select the servers in the crawl group
  • Enable server load balancing

Done!

Author: jqpeng
Original post: Best practices for JVM memory settings in container environments

With the rise of Docker and K8S, many services now run in containers, and for Java programs the JVM settings are an important part of that. Here is a summary of the best practices from our projects.

Java heap basics

By default, the heap size the JVM allocates depends on the machine configuration. For example, on a server with 64G of RAM:

java -XX:+PrintFlagsFinal -version | grep -Ei "maxheapsize|maxram"
    uintx DefaultMaxRAMFraction                     = 4                                   {product}
    uintx MaxHeapSize                              := 16875782144                         {product}
 uint64_t MaxRAM                                    = 137438953472                        {pd product}
    uintx MaxRAMFraction                            = 4                                   {product}
   double MaxRAMPercentage                          = 25.000000                           {product}
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)

As you can see, the maximum heap (MaxHeapSize) the JVM picks is 16G, computed as follows:

MaxHeapSize = MaxRAM * 1 / MaxRAMFraction

MaxRAMFraction defaults to 4, which means each JVM uses at most 25% of the machine's memory for the heap.

Note, however, that the memory the JVM actually uses is larger than the heap:

JVM memory = heap memory + thread stack size (Xss) * number of threads + constant overhead

The default Xss is usually between 256KB and 1MB, i.e. each thread adds at least 256K of extra memory, and the constant overhead is the other memory the JVM allocates for itself. For example, with a 1G heap and 200 threads at a 1MB stack size, the process already approaches 1.2G before counting that overhead.

We can specify the maximum heap size with -Xmx:

java -XX:+PrintFlagsFinal -Xmx1g -version | grep -Ei "maxheapsize|maxram"
    uintx DefaultMaxRAMFraction                     = 4                                   {product}
    uintx MaxHeapSize                              := 1073741824                          {product}
 uint64_t MaxRAM                                    = 137438953472                        {pd product}
    uintx MaxRAMFraction                            = 4                                   {product}
   double MaxRAMPercentage                          = 25.000000                           {product}
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)

Alternatively, you can use -XX:MaxRAM:

java -XX:+PrintFlagsFinal -XX:MaxRAM=1g -version | grep -Ei "maxheapsize|maxram"

However, specifying -Xmx or MaxRAM requires knowing how much memory the machine has. A better option is to set MaxRAMFraction; the table below shows the share of RAM available to the heap for each value:

+----------------+-------------------+
| MaxRAMFraction | % of RAM for heap |
|----------------+-------------------|
|              1 |              100% |
|              2 |               50% |
|              3 |               33% |
|              4 |               25% |
+----------------+-------------------+

Java heap in containers

In a container, Java cannot see the container's memory limit and only sees the host's configuration:

$ docker run --rm alpine free -m
             total     used     free   shared  buffers   cached
Mem:          1998     1565      432        0        8     1244
$ docker run --rm -m 256m alpine free -m
             total     used     free   shared  buffers   cached
Mem:          1998     1552      445        1        8     1244

This easily causes problems: for example, the container is limited to 100M, but the JVM sizes its initial memory based on the host configuration, so the Java process exceeds the container limit and gets killed. You can work around this by setting -Xmx or MaxRAM, but as described in the first part, that is not elegant.

To solve this properly, Java 10 introduced +UseContainerSupport (enabled by default), which lets the JVM allocate a sensible heap inside containers. The feature was also backported to JDK 8 in 8u191, and JDK 8 is the most widely used JDK.

UseContainerSupport

-XX:+UseContainerSupport lets the JVM read the cgroup limits from the host, such as the available CPU and RAM, and configure itself accordingly. When the heap then exceeds the container's memory limit, the JVM throws an OutOfMemoryError instead of the container being killed.
The feature is available on Java 8u191+, 10 and later.

Note that since 8u191 the -XX:{Min|Max}RAMFraction flags are deprecated in favor of -XX:MaxRAMPercentage, which takes a value between 0.0 and 100.0 and defaults to 25.0.

Best practice

Pull the latest openjdk:8-jre-alpine as the base image; as of this post the latest build is 212, which is > 191:

docker run -it --rm openjdk:8-jre-alpine java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (IcedTea 3.12.0) (Alpine 8.212.04-r0)
OpenJDK 64-Bit Server VM (build 25.212-b04, mixed mode)

We build a base image from the following Dockerfile:

FROM openjdk:8-jre-alpine
MAINTAINER jadepeng

RUN echo "http://mirrors.aliyun.com/alpine/v3.6/main" > /etc/apk/repositories \
    && echo "http://mirrors.aliyun.com/alpine/v3.6/community" >> /etc/apk/repositories \
    && apk update upgrade \
    && apk add --no-cache procps unzip curl bash tzdata \
    && ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
    && echo "Asia/Shanghai" > /etc/timezone

RUN apk add --update ttf-dejavu && rm -rf /var/cache/apk/*

In the application's startup parameters, set -XX:+UseContainerSupport and -XX:MaxRAMPercentage=75.0; this leaves enough memory headroom for other processes (debugging, monitoring) without wasting too much RAM.
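To sanity-check that the limits are actually picked up inside the container, a small probe like the one below (my own sketch, not part of the original setup) prints what the JVM sees at runtime:

public class JvmMemoryProbe {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // with -XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0 in a container limited to 1g,
        // maxMemory() should report roughly 768MB rather than 25% of the host's RAM
        System.out.println("max heap (MB): " + rt.maxMemory() / 1024 / 1024);
        System.out.println("available processors: " + rt.availableProcessors());
    }
}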



Author: jqpeng
Original post: Canary releases on K8S with ingress-nginx

I previously described canary releases with Ambassador; this post shows how to implement them with ingress-nginx.

Introduction

Ingress-Nginx is a K8S ingress controller that supports canary releases and testing in different scenarios through Ingress annotations. It supports the following four canary rules:

  • nginx.ingress.kubernetes.io/canary-by-header: traffic splitting based on a request header, suitable for canary releases and A/B testing. When the request header is set to always, requests are always routed to the canary version; when it is set to never, requests are never sent to the canary; for any other value the header is ignored and the request is weighed against the remaining canary rules by priority.
  • nginx.ingress.kubernetes.io/canary-by-header-value: the request header value to match, which tells the Ingress to route the request to the service specified in the canary Ingress. When the request header equals this value the request is routed to the canary. This rule lets you customize the header value and must be used together with the previous annotation (canary-by-header).
  • nginx.ingress.kubernetes.io/canary-weight: traffic splitting based on service weight, suitable for blue-green deployments. The weight ranges from 0 to 100 and routes that percentage of requests to the service specified in the canary Ingress. A weight of 0 means the canary rule sends no traffic to the canary service; a weight of 100 means all requests go to the canary.
  • nginx.ingress.kubernetes.io/canary-by-cookie: traffic splitting based on a cookie, suitable for canary releases and A/B testing. The cookie tells the Ingress to route the request to the service specified in the canary Ingress. When the cookie value is set to always, the request is routed to the canary; when it is never, it is not; for any other value the cookie is ignored and the request is weighed against the remaining canary rules by priority.

Note: the canary rules are evaluated in the following order of precedence:

canary-by-header - > canary-by-cookie - > canary-weight

The four annotations above fall into two broad categories:

  • weight-based canary rules

Weight-based

  • request-based canary rules

Rule-based

Note: the canary feature was introduced in Ingress-Nginx 0.21.0, so make sure your ingress controller version is recent enough.

Testing

Preparing the applications

There are two versions of the service. The normal version:

import static java.util.Collections.singletonMap;

@SpringBootApplication
@Controller
public class RestPrometheusApplication {

    @Autowired
    private MeterRegistry registry;

    @GetMapping(path = "/", produces = "application/json")
    @ResponseBody
    public Map<String, Object> landingPage() {
        Counter.builder("mymetric").tag("foo", "bar").register(registry).increment();
        return singletonMap("hello", "ambassador");
    }

    public static void main(String[] args) {
        SpringApplication.run(RestPrometheusApplication.class, args);
    }
}

Accessing it returns:

{"hello":"ambassador"}    

The canary (gray) version:

import static java.util.Collections.singletonMap;

@SpringBootApplication
@Controller
public class RestPrometheusApplication {

    @Autowired
    private MeterRegistry registry;

    @GetMapping(path = "/", produces = "application/json")
    @ResponseBody
    public Map<String, Object> landingPage() {
        Counter.builder("mymetric").tag("foo", "bar").register(registry).increment();
        return singletonMap("hello", "ambassador, this is a gray version");
    }

    public static void main(String[] args) {
        SpringApplication.run(RestPrometheusApplication.class, args);
    }
}

Accessing it returns:

{"hello":"ambassador, this is a gray version"}    

Ingress configuration

With both services deployed (springboot-rest-demo as the normal service and springboot-rest-demo-gray as the canary), we configure the Ingress resources using canary-by-header:

For the normal service:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: springboot-rest-demo
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: springboot-rest.jadepeng.com
    http:
      paths:
      - backend:
          serviceName: springboot-rest-demo
          servicePort: 80

For the canary:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: springboot-rest-demo-gray
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "canary"
    nginx.ingress.kubernetes.io/canary-by-header-value: "true"
spec:
  rules:
  - host: springboot-rest.jadepeng.com
    http:
      paths:
      - backend:
          serviceName: springboot-rest-demo-gray
          servicePort: 80

Apply the manifests:

kubectl -n=default apply -f ingress-test.yml 
ingress.extensions/springboot-rest-demo created
ingress.extensions/springboot-rest-demo-gray created

Test it: without the header, requests hit the normal version by default:

# curl http://springboot-rest.jadepeng.com; echo
{"hello":"ambassador"}
# curl http://springboot-rest.jadepeng.com; echo
{"hello":"ambassador"}

With the header added, you can see that requests now hit the canary version:

# curl -H "canary: true" http://springboot-rest.jadepeng.com; echo
{"hello":"ambassador, this is a gray version"}

Multiple Ingress controllers

References



Author: jqpeng
Original post: Canary releases on K8S with Ambassador

Why canary releases

A canary (gray) release is a release strategy that transitions smoothly between "black" and "white". It enables A/B testing: some users keep using feature A while others start using feature B, and if users have no objections to B, the rollout is gradually widened until everyone is migrated to B.

Typical scenarios:

  • a microservice depends on many components and needs to be validated in a real environment
  • shipping a new feature is risky, so you route a small share of real users to it to reduce that risk
  • you want only specific users to reach the new version, e.g. deploy a build that only testers use
  • A/B testing: deploy two versions and compare them, e.g. to compare the quality of two recommendation services

Canary releases keep the overall system stable: problems can be discovered and fixed during the initial canary phase, which limits their impact.

About Ambassador

Ambassador [æmˈbæsədər] is a Kubernetes microservices API gateway built on Envoy Proxy.

Open Source Kubernetes-Native API Gateway built on the Envoy Proxy

Official site:

https://www.getambassador.io/

Deploying Ambassador

Deploy Ambassador as described on the official site:

cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: Service
metadata:
  labels:
    service: ambassador-admin
  name: ambassador-admin
spec:
  type: NodePort
  ports:
  - name: ambassador-admin
    port: 8877
    targetPort: 8877
  selector:
    service: ambassador
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: ambassador
rules:
- apiGroups: [""]
  resources: [ "endpoints", "namespaces", "secrets", "services" ]
  verbs: ["get", "list", "watch"]
- apiGroups: [ "getambassador.io" ]
  resources: [ "*" ]
  verbs: ["get", "list", "watch"]
- apiGroups: [ "apiextensions.k8s.io" ]
  resources: [ "customresourcedefinitions" ]
  verbs: ["get", "list", "watch"]
- apiGroups: [ "networking.internal.knative.dev" ]
  resources: [ "clusteringresses", "ingresses" ]
  verbs: ["get", "list", "watch"]
- apiGroups: [ "networking.internal.knative.dev" ]
  resources: [ "ingresses/status", "clusteringresses/status" ]
  verbs: ["update"]
- apiGroups: [ "extensions" ]
  resources: [ "ingresses" ]
  verbs: ["get", "list", "watch"]
- apiGroups: [ "extensions" ]
  resources: [ "ingresses/status" ]
  verbs: ["update"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ambassador
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: ambassador
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ambassador
subjects:
- kind: ServiceAccount
  name: ambassador
  namespace: kube-system
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: authservices.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: authservices
    singular: authservice
    kind: AuthService
    categories:
    - ambassador-crds
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: consulresolvers.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: consulresolvers
    singular: consulresolver
    kind: ConsulResolver
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: kubernetesendpointresolvers.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: kubernetesendpointresolvers
    singular: kubernetesendpointresolver
    kind: KubernetesEndpointResolver
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: kubernetesserviceresolvers.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: kubernetesserviceresolvers
    singular: kubernetesserviceresolver
    kind: KubernetesServiceResolver
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: mappings.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: mappings
    singular: mapping
    kind: Mapping
    categories:
    - ambassador-crds
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: modules.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: modules
    singular: module
    kind: Module
    categories:
    - ambassador-crds
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ratelimitservices.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: ratelimitservices
    singular: ratelimitservice
    kind: RateLimitService
    categories:
    - ambassador-crds
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tcpmappings.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: tcpmappings
    singular: tcpmapping
    kind: TCPMapping
    categories:
    - ambassador-crds
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tlscontexts.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: tlscontexts
    singular: tlscontext
    kind: TLSContext
    categories:
    - ambassador-crds
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tracingservices.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: tracingservices
    singular: tracingservice
    kind: TracingService
    categories:
    - ambassador-crds
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: logservices.getambassador.io
spec:
  group: getambassador.io
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: logservices
    singular: logservice
    kind: LogService
    categories:
    - ambassador-crds
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ambassador
spec:
  replicas: 3
  selector:
    matchLabels:
      service: ambassador
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
        "consul.hashicorp.com/connect-inject": "false"
      labels:
        service: ambassador
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  service: ambassador
              topologyKey: kubernetes.io/hostname
      serviceAccountName: ambassador
      containers:
      - name: ambassador
        image: quay.azk8s.cn/datawire/ambassador:0.86.1
        resources:
          limits:
            cpu: 1
            memory: 400Mi
          requests:
            cpu: 200m
            memory: 100Mi
        env:
        - name: AMBASSADOR_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        ports:
        - name: http
          containerPort: 8080
        - name: https
          containerPort: 8443
        - name: admin
          containerPort: 8877
        livenessProbe:
          httpGet:
            path: /ambassador/v0/check_alive
            port: 8877
          initialDelaySeconds: 30
          periodSeconds: 3
        readinessProbe:
          httpGet:
            path: /ambassador/v0/check_ready
            port: 8877
          initialDelaySeconds: 30
          periodSeconds: 3
        volumeMounts:
        - name: ambassador-pod-info
          mountPath: /tmp/ambassador-pod-info
      volumes:
      - name: ambassador-pod-info
        downwardAPI:
          items:
          - path: "labels"
            fieldRef:
              fieldPath: metadata.labels
      restartPolicy: Always
      securityContext:
        runAsUser: 8888
---
apiVersion: v1
kind: Service
metadata:
  name: ambassador
spec:
  type: NodePort
  externalTrafficPolicy: Local
  ports:
   - port: 80
     targetPort: 8080
  selector:
    service: ambassador


EOF

To make the gateway easy to reach, create an Ingress for it:


apiVersion: extensions/v1beta1
kind: Ingress
metadata:
 annotations:
   nginx.ingress.kubernetes.io/proxy-body-size: "0"
   nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
   nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
   kubernetes.io/tls-acme: 'true'
 name: ambassador
spec:
 rules:
 - host: ambassador.iflyresearch.com
   http:
     paths:
     - backend:
         serviceName: ambassador
         servicePort: 80
       path: /

Ambassador configuration

Ambassador uses Envoy for the actual load balancing; Envoy plays a role similar to nginx. Roughly speaking, Ambassador reads the configuration attached to Services and generates the Envoy configuration from it; when a Service changes, it updates the Envoy configuration dynamically and reloads it, so Ambassador needs access to the Kubernetes service API.

The Ambassador configuration lives in the metadata annotations, under keys starting with getambassador.io/config:

  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind:  Mapping
      name:  {{ .Values.service.name }}_mapping
      prefix: /{{ .Values.service.prefix }}
      service: {{ .Values.service.name }}.{{ .Release.Namespace }}

prefix specifies the path under which the service is reached, and service points at the target service. Note that the namespace name must be appended, otherwise you will often get errors about the backend not being found.

Canary releases with Ambassador

Ambassador can do canary routing either by weight or by matching specific headers.

Weight-based canary

Usage:

Deploy a new version of the service with the same prefix as the old one but with a weight configured, e.g. 20; then 20% of the traffic flows to the new service, which gives you an A/B test.

---
apiVersion: v1
kind: Service
metadata:
  name: svc-gray
  namespace: default
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind:  Mapping
      name:  svc1_mapping
      prefix: /svc/
      service: service-gray
      weight: 20
spec:
  selector:
    app: testservice
  ports:
  - port: 8080
    name: service-gray
    targetPort: http-api

Canary by request header (regex_headers for regular-expression matching)

If you deploy a new version that only specific users should be able to reach, this approach works well.

For example:

---
apiVersion: v1
kind: Service
metadata:
  name: svc-gray
  namespace: default
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind:  Mapping
      name:  svc1_mapping
      prefix: /svc/
      service: service-gray
      headers:
        gray: true
spec:
  selector:
    app: testservice
  ports:
  - port: 8080
    name: service-gray
    targetPort: http-api

When calling the service, specifying gray: true routes the request to the canary version; you can test this with Postman:

POSTMAN



Author: jqpeng
Original post: Managing git repositories from Java with JGit

I've recently been designing a new GitOps-based CI/CD pipeline that needs to read and write git repositories from Java; here are some quick notes.

JGit is a pure-Java library for reading and writing git repositories. Its basic usage is described below.

Adding JGit

Maven dependency:

        <!-- https://mvnrepository.com/artifact/org.eclipse.jgit/org.eclipse.jgit -->
        <dependency>
            <groupId>org.eclipse.jgit</groupId>
            <artifactId>org.eclipse.jgit</artifactId>
            <version>5.6.0.201912101111-r</version>
        </dependency>

JGit provides a Git class that can be used for the usual git operations.

Credential management

Credentials are managed through a CredentialsProvider; the most commonly used one is UsernamePasswordCredentialsProvider.

Initialize it with the following code:

public static CredentialsProvider createCredential(String userName, String password) {
        return new UsernamePasswordCredentialsProvider(userName, password);
    }

Cloning a remote repository

git command:

git clone {repoUrl}

Use Git.cloneRepository to clone a remote repository; if credentials are required, set the credentialsProvider:

 public static Git fromCloneRepository(String repoUrl, String cloneDir, CredentialsProvider provider) throws GitAPIException {
        Git git = Git.cloneRepository()
            .setCredentialsProvider(provider)
            .setURI(repoUrl)
            .setDirectory(new File(cloneDir)).call();
        return git;
    }

commit

git command:

git commit -a -m '{msg}'

Committing is straightforward and maps to the commit method; note that you need to call add first:

    public static void commit(Git git, String message, CredentialsProvider provider) throws GitAPIException {
        git.add().addFilepattern(".").call();
        git.commit()
            .setMessage(message)
            .call();
    }

push

git command:

git push origin branchName

For push, simply call push; the credentialsProvider must be provided:

   public static void push(Git git, CredentialsProvider provider) throws GitAPIException, IOException {
        push(git,null,provider);
    }

    public static void push(Git git, String branch, CredentialsProvider provider) throws GitAPIException, IOException {
        if (branch == null) {
            branch = git.getRepository().getBranch();
        }
        git.push()
            .setCredentialsProvider(provider)
            .setRemote("origin").setRefSpecs(new RefSpec(branch)).call();
    }

Opening an existing repository

What if the repository has already been cloned and you just want to open it?

 public static Repository getRepositoryFromDir(String dir) throws IOException {
        return new FileRepositoryBuilder()
            .setGitDir(Paths.get(dir, ".git").toFile())
            .build();
    }

Reading the repository log

The log can be read with a RevWalk:

  • revWalk.parseCommit reads a single commit
  • iterating over the RevWalk walks through all log entries
 public static List<String> getLogs(Repository repository) throws IOException {
        return getLogsSinceCommit(repository, null, null);
    }

    public static List<String> getLogsSinceCommit(Repository repository, String commit) throws IOException {
        return getLogsSinceCommit(repository, null, commit);
    }

    public static List<String> getLogsSinceCommit(Repository repository, String branch, String commit) throws IOException {
        if (branch == null) {
            branch = repository.getBranch();
        }
        Ref head = repository.findRef("refs/heads/" + branch);
        List<String> commits = new ArrayList<>();
        if (head != null) {
            try (RevWalk revWalk = new RevWalk(repository)) {
                revWalk.markStart(revWalk.parseCommit(head.getObjectId()));
                for (RevCommit revCommit : revWalk) {
                    if (revCommit.getId().getName().equals(commit)) {
                        break;
                    }
                    commits.add(revCommit.getFullMessage());
                    System.out.println("\nCommit-Message: " + revCommit.getFullMessage());
                }
                revWalk.dispose();
            }
        }

        return commits;
    }
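The test code further down also calls a getLastCommit helper that is not shown in the post; a minimal sketch built on the same RevWalk API might look like this (my reconstruction, not the original code):

    public static RevCommit getLastCommit(Repository repository) throws IOException {
        // resolve whatever HEAD currently points to and parse that commit
        ObjectId head = repository.resolve("HEAD");
        try (RevWalk revWalk = new RevWalk(repository)) {
            return revWalk.parseCommit(head);
        }
    }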

Testing

Let's clone a repository, modify it, and push the change:

 String yaml = "dependencies:\n" +
            "- name: springboot-rest-demo\n" +
            "  version: 0.0.5\n" +
            "  repository: http://hub.hubHOST.com/chartrepo/ainote\n" +
            "  alias: demo\n" +
            "- name: exposecontroller\n" +
            "  version: 2.3.82\n" +
            "  repository: http://chartmuseum.jenkins-x.io\n" +
            "  alias: cleanup\n";
        CredentialsProvider provider = createCredential("USR_NAME", "PASSWORD");

        String cloneDir = "/tmp/test";

        Git git = fromCloneRepository("http://gitlab.GITHOST.cn/env-test.git", cloneDir, provider);

        // modify a file
        FileUtils.writeStringToFile(Paths.get(cloneDir, "env", "requirements.yaml").toFile(), yaml, "utf-8");
        // commit
        commit(git, "deploy(app): deploy  springboot-rest-demo:0.0.5 to env test", provider);
        // push to the remote repository
        push(git, "master", provider);

        git.clean().call();
        git.close();

        FileUtils.deleteDirectory(new File(cloneDir));

Reading the log of an existing repository:

        Repository repository = getRepositoryFromDir("GIT_DIR");
        List<String> logs = getLogs(repository);
        System.out.println(logs);

        RevCommit head = getLastCommit(repository);
        System.out.println(head.getFullMessage());

Summary

This post showed how to perform the usual git operations with JGit.



Author: jqpeng
Original post: Parsing and serializing YAML in Java with SnakeYAML

1. Overview

In this article we will learn how to use the SnakeYAML library to convert YAML documents into Java objects, and how to serialize Java objects into YAML documents.

2. Project setup

To use SnakeYAML in a project, add the Maven dependency (the latest version can be found here):

<dependency>
    <groupId>org.yaml</groupId>
    <artifactId>snakeyaml</artifactId>
    <version>1.25</version>
</dependency>

3. Entry point

The Yaml class is the entry point of the API:

Yaml yaml = new Yaml();

Since the implementation is not thread-safe, each thread must have its own Yaml instance.
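One common way to honor that constraint in multi-threaded code is to keep one instance per thread, for example via a ThreadLocal (a sketch of mine, not from the original article):

import org.yaml.snakeyaml.Yaml;

public class YamlHolder {
    // each thread lazily gets its own Yaml instance, since Yaml is not thread-safe
    private static final ThreadLocal<Yaml> YAML = ThreadLocal.withInitial(Yaml::new);

    public static Yaml get() {
        return YAML.get();
    }
}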

4. Loading a YAML document

SnakeYAML can load a document from a String or an InputStream. We start by defining a simple YAML document and naming the file customer.yaml:

firstName: "John"
lastName: "Doe"
age: 20

4.1. Basic usage

Now we use the Yaml class to parse the YAML document above:

Yaml yaml = new Yaml();
InputStream inputStream = this.getClass()
  .getClassLoader()
  .getResourceAsStream("customer.yaml");
Map<String, Object> obj = yaml.load(inputStream);
System.out.println(obj);

The code above produces the following output:

{firstName=John, lastName=Doe, age=20}

By default, load() returns a Map. When querying the Map we need to know the property key names in advance, which is error-prone; a better approach is to use a custom type.

4.2. Parsing into a custom type

SnakeYAML provides a way to parse a document into a custom type.

Let's define a Customer class and then try to load the document again:

public class Customer {
 
    private String firstName;
    private String lastName;
    private int age;
 
    // getters and setters
}

Now let's load it:

Yaml yaml = new Yaml();
InputStream inputStream = this.getClass()
 .getClassLoader()
 .getResourceAsStream("customer.yaml");
Customer customer = yaml.load(inputStream);

Another option is to use a Constructor:

Yaml yaml = new Yaml(new Constructor(Customer.class));

4.3. Implicit types

If no type is defined for a given property, the library automatically converts the value to an implicit type.

For example:

1.0 -> Float
42 -> Integer
2009-03-30 -> Date

Let's verify this implicit type conversion with a test case:

@Test
public void whenLoadYAML_thenLoadCorrectImplicitTypes() {
   Yaml yaml = new Yaml();
   Map<Object, Object> document = yaml.load("3.0: 2018-07-22");
  
   assertNotNull(document);
   assertEquals(1, document.size());
   assertTrue(document.containsKey(3.0d));   
}

4.4. Nested objects

SnakeYAML supports nested complex types.

Let's add contact details and an address to customer.yaml and save the new file as customer_with_contact_details_and_address.yaml.

We will now parse the new YAML document:

firstName: "John"
lastName: "Doe"
age: 31
contactDetails:
   - type: "mobile"
     number: 123456789
   - type: "landline"
     number: 456786868
homeAddress:
   line: "Xyz, DEF Street"
   city: "City Y"
   state: "State Y"
   zip: 345657

We update the Java classes accordingly:

public class Customer {
    private String firstName;
    private String lastName;
    private int age;
    private List<Contact> contactDetails;
    private Address homeAddress;    
    // getters and setters
}

public class Contact {
    private String type;
    private int number;
    // getters and setters
}

public class Address {
    private String line;
    private String city;
    private String state;
    private Integer zip;
    // getters and setters
}

Now let's test Yaml's load() method:

@Test
public void
  whenLoadYAMLDocumentWithTopLevelClass_thenLoadCorrectJavaObjectWithNestedObjects() {
  
    Yaml yaml = new Yaml(new Constructor(Customer.class));
    InputStream inputStream = this.getClass()
      .getClassLoader()
      .getResourceAsStream("yaml/customer_with_contact_details_and_address.yaml");
    Customer customer = yaml.load(inputStream);
  
    assertNotNull(customer);
    assertEquals("John", customer.getFirstName());
    assertEquals("Doe", customer.getLastName());
    assertEquals(31, customer.getAge());
    assertNotNull(customer.getContactDetails());
    assertEquals(2, customer.getContactDetails().size());
     
    assertEquals("mobile", customer.getContactDetails()
      .get(0)
      .getType());
    assertEquals(123456789, customer.getContactDetails()
      .get(0)
      .getNumber());
    assertEquals("landline", customer.getContactDetails()
      .get(1)
      .getType());
    assertEquals(456786868, customer.getContactDetails()
      .get(1)
      .getNumber());
    assertNotNull(customer.getHomeAddress());
    assertEquals("Xyz, DEF Street", customer.getHomeAddress()
      .getLine());
}

4.5. Type-safe collections

When one or more properties of a given Java class are generic collections, the generic type needs to be specified via a TypeDescription so that it can be parsed correctly.

Let's assume a Customer that has more than one Contact:

firstName: "John"
lastName: "Doe"
age: 31
contactDetails:
   - { type: "mobile", number: 123456789}
   - { type: "landline", number: 123456789}

To parse this correctly, we can specify a TypeDescription for the property on the top-level class:

Constructor constructor = new Constructor(Customer.class);
TypeDescription customTypeDescription = new TypeDescription(Customer.class);
customTypeDescription.addPropertyParameters("contactDetails", Contact.class);
constructor.addTypeDescription(customTypeDescription);
Yaml yaml = new Yaml(constructor);

4.6. Loading multiple documents

Sometimes a single file contains several YAML documents and we want to parse all of them. The Yaml class provides a loadAll() method for exactly this kind of parsing.

Assume the following content sits in one file:

---
firstName: "John"
lastName: "Doe"
age: 20
---
firstName: "Jack"
lastName: "Jones"
age: 25

We can parse it with loadAll(), as shown in the following code example:

@Test
public void whenLoadMultipleYAMLDocuments_thenLoadCorrectJavaObjects() {
    Yaml yaml = new Yaml(new Constructor(Customer.class));
    InputStream inputStream = this.getClass()
      .getClassLoader()
      .getResourceAsStream("yaml/customers.yaml");
 
    int count = 0;
    for (Object object : yaml.loadAll(inputStream)) {
        count++;
        assertTrue(object instanceof Customer);
    }
    assertEquals(2,count);
}

5. Generating YAML files

SnakeYAML can also serialize Java objects into YAML.

5.1. Basic usage

We start with a simple example that dumps a Map<String, Object> instance into a YAML document (a String):

@Test
public void whenDumpMap_thenGenerateCorrectYAML() {
    Map<String, Object> data = new LinkedHashMap<String, Object>();
    data.put("name", "Silenthand Olleander");
    data.put("race", "Human");
    data.put("traits", new String[] { "ONE_HAND", "ONE_EYE" });
    Yaml yaml = new Yaml();
    StringWriter writer = new StringWriter();
    yaml.dump(data, writer);
    String expectedYaml = "name: Silenthand Olleander\nrace: Human\ntraits: [ONE_HAND, ONE_EYE]\n";
 
    assertEquals(expectedYaml, writer.toString());
}

The code above produces the following output (note that using a LinkedHashMap instance preserves the order of the output data):

name: Silenthand Olleander
race: Human
traits: [ONE_HAND, ONE_EYE]

5.2. Custom Java objects

We can also dump custom Java types to an output stream:

@Test
public void whenDumpACustomType_thenGenerateCorrectYAML() {
    Customer customer = new Customer();
    customer.setAge(45);
    customer.setFirstName("Greg");
    customer.setLastName("McDowell");
    Yaml yaml = new Yaml();
    StringWriter writer = new StringWriter();
    yaml.dump(customer, writer);        
    String expectedYaml = "!!com.baeldung.snakeyaml.Customer {age: 45, contactDetails: null, firstName: Greg,\n  homeAddress: null, lastName: McDowell}\n";
 
    assertEquals(expectedYaml, writer.toString());
}

The generated content contains the tag !!com.baeldung.snakeyaml.Customer. To avoid the tag name in the output file, we can use the library's dumpAs() method.

So in the code above we can make the following adjustment to remove the tag:

yaml.dumpAs(customer, Tag.MAP, null);

6. Conclusion

This article showed how to parse and serialize YAML documents with the SnakeYAML library.

All examples can be found in the GitHub project.

Appendix



Author: jqpeng
Original post: Fetching build records and logs through the Drone REST API

Drone is a CI/CD tool that exposes a REST API; here is a quick introduction to using the API to fetch build logs.

Getting a token

Log into Drone, click your avatar, and choose Token from the menu.

Copy the token.

API overview

The Drone API falls into several groups:

  • Builds
  • Cron (scheduled jobs)
  • Repos
  • Secrets
  • User
  • Users

Example call:

Build API

Build list

Get the repository's recent builds:

GET /api/repos/{owner}/{repo}/builds

curl -i http://drone.YOUR_HOST.cn/api/repos/jqpeng/springboot-rest-demo/builds -H "Authorization: Bearer TOKEN"

Example response body:

[
  {
      "id": 100207,
      "repo_id": 296163,
      "number": 42,
      "status": "success",
      "event": "pull_request",
      "action": "sync",
      "link": "https://github.com/octoat/hello-world/compare/e3320539a4c0...9fc1ad6ebf12",
      "message": "updated README",
      "before": "e3320539a4c03ccfda992641646deb67d8bf98f3",
      "after": "9fc1ad6ebf12462f3f9773003e26b4c6f54a772e",
      "ref": "refs/heads/master",
      "source_repo": "spaceghost/hello-world",
      "source": "develop",
      "target": "master",
      "author_login": "octocat",
      "author_name": "The Octocat",
      "author_email": "octocat@github.com",
      "author_avatar": "http://www.gravatar.com/avatar/7194e8d48fa1d2b689f99443b767316c",
      "sender": "bradrydzewski",
      "started": 1564085874,
      "finished": 1564086343,
      "created": 1564085874,
      "updated": 1564085874,
      "version": 3
  }
]

Build details

This endpoint returns the details of a build, including its status; {build} is the number field from the list above, i.e. the build sequence number.

GET /api/repos/{owner}/{repo}/builds/{build}

Example Response Body:

{
    "id": 39862,
    "number": 20,
    "parent": 0,
    "event": "push",
    "status": "success",
    "error": "",
    "enqueued_at": 1576636849,
    "created_at": 1576636849,
    "started_at": 1576636850,
    "finished_at": 1576639053,
    "deploy_to": "",
    "commit": "7729006bfe11933da6c564101acaf8c7f78c5f62",
    "branch": "master",
    "ref": "refs/heads/master",
    "refspec": "",
    "remote": "",
    "title": "",
    "message": "通过update更新\n",
    "timestamp": 0,
    "sender": "",
    "author": "jqpeng",
    "author_avatar": "https://www.gravatar.com/avatar/4ab53b564545f18efc4079c30a2d35cf.jpg?s=128",
    "link_url": "",
    "signed": false,
    "verified": true,
    "reviewed_by": "",
    "reviewed_at": 0,
    "procs": [
        {
            "id": 247912,
            "build_id": 39862,
            "pid": 1,
            "ppid": 0,
            "pgid": 1,
            "name": "",
            "state": "success",
            "exit_code": 0,
            "start_time": 1576636850,
            "end_time": 1576639053,
            "machine": "21e73ce43038",
            "children": [
                {
                    "id": 247913,
                    "build_id": 39862,
                    "pid": 2,
                    "ppid": 1,
                    "pgid": 2,
                    "name": "clone",
                    "state": "success",
                    "exit_code": 0,
                    "start_time": 1576636853,
                    "end_time": 1576636933,
                    "machine": "21e73ce43038"
                },
                {
                    "id": 247914,
                    "build_id": 39862,
                    "pid": 3,
                    "ppid": 1,
                    "pgid": 3,
                    "name": "build",
                    "state": "success",
                    "exit_code": 0,
                    "start_time": 1576636933,
                    "end_time": 1576636998,
                    "machine": "21e73ce43038"
                }
            ]
        }
    ]
}

procs are the build steps; note the pid, which is needed to fetch the build logs.

Build logs

To fetch build logs, pass {log} and {pid}: log is the build {number} above and {pid} is the pid returned in the previous step.

GET /api/repos/{owner}/{repo}/logs/{log}/{pid}

Example response body:

[
  {
    "proc": "clone",
    "pos": 0,
    "out": "+ git init\n"
  },
  {
    "proc": "clone",
    "pos": 1,
    "out": "Initialized empty Git repository in /drone/src/github.com/octocat/hello-world/.git/\n"
  },
  {
    "proc": "clone",
    "pos": 2,
    "out": "+ git remote add origin https://github.com/octocat/hello-world.git\n"
  },
  {
    "proc": "clone",
    "pos": 3,
    "out": "+ git fetch --no-tags origin +refs/heads/master:\n"
  },
  {
    "proc": "clone",
    "pos": 4,
    "out": "From https://github.com/octocat/hello-world\n"
  },
  {
    "proc": "clone",
    "pos": 5,
    "out": " * branch            master     -> FETCH_HEAD\n"
  },
  {
    "proc": "clone",
    "pos": 6,
    "out": " * [new branch]      master     -> origin/master\n"
  },
  {
    "proc": "clone",
    "pos": 7,
    "out": "+ git reset --hard -q 62126a02ffea3dabd7789e5c5407553490973665\n"
  },
  {
    "proc": "clone",
    "pos": 8,
    "out": "+ git submodule update --init --recursive\n"
  }
]
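As a sketch of how these endpoints can be called from Java (the paths come from the documentation above; the host, owner, repo, build, pid and token values are placeholders to substitute with your own):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class DroneLogClient {

    // fetch the raw JSON log output of one build step: /api/repos/{owner}/{repo}/logs/{build}/{pid}
    public static String fetchStepLogs(String droneHost, String owner, String repo,
                                       int build, int pid, String token) throws Exception {
        URL url = new URL(droneHost + "/api/repos/" + owner + "/" + repo + "/logs/" + build + "/" + pid);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Authorization", "Bearer " + token);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        // placeholder values; replace with your own Drone host, repository and token
        System.out.println(fetchStepLogs("http://drone.YOUR_HOST.cn", "jqpeng", "springboot-rest-demo", 20, 2, "TOKEN"));
    }
}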


Author: jqpeng
Original post: Training word2vec with fastText and using it in a training task

I've been testing OpenNRE recently. Without a GPU server the BERT-based models won't run, so I considered word2vec and picked fastText up again.

Download and install

First clone the code:

git clone https://github.com/facebookresearch/fastText.git

Then build it with make:

make

After the build, move the generated fasttext binary into a bin directory:

cp fasttext /usr/local/bin/

Training word2vec

First segment the corpus into words and save it, e.g. as sent_train.txt; the file contains the Chinese text after word segmentation:

楚穆王 十二年 : ( 丁未 , 公元前 614 年 ) , 在位 12 年 的 楚穆王 死 , 死后 葬 在 楚 郢 之西 。

Then run fasttext to train:

fasttext skipgram -input sent_train.txt -output ./result

Training finishes quickly; afterwards you can see two generated files:

# ll result.*
-rw-r--r-- 1 root root 876945058 Nov 21 09:29 result.bin
-rw-r--r-- 1 root root  82656362 Nov 21 09:29 result.vec

Looking at the vec file, you can see the vectors are 100-dimensional:

# head result.vec 
94161 100
, 0.030437 0.12177 -0.1367 0.11101 -0.49543 0.49908 0.033441 -0.025445 -0.036312 -0.081132 -0.082666 0.27204 -0.2367 -0.23424 -0.30124 -0.029666 -0.23803 -0.083255 -0.03177 -0.23129 0.33953 -0.095728 0.023824 -0.33981 -0.048715 -0.22876 0.24215 -0.094075 -0.077224 0.097473 -0.012714 -0.16661 -0.5156 -0.12635 -0.34265 -0.13444 -0.25535 -0.29832 0.14624 -0.24779 0.25403 0.17662 0.070345 -0.15927 0.3449 0.11372 0.22504 0.15652 0.19013 -0.029641 -0.1761 0.018512 -0.19782 -0.15607 0.39958 0.31343 0.30654 0.062457 -0.045659 -0.008893 0.11445 0.035771 0.048592 0.17336 0.15742 -0.085562 -0.12398 0.25767 -0.087141 -0.10011 -0.14832 0.11072 0.0037114 0.18156 -0.32666 -0.081212 0.1102 -0.035646 0.09467 0.014385 0.11191 -0.14713 -0.0052515 -0.006049 0.3735 -0.13804 0.12271 -0.050977 -0.019325 -0.034865 0.019665 -0.16755 0.034194 0.074825 0.16173 -0.20006 -0.03904 -0.18061 0.040119 -0.22622 
</s> -0.23353 0.071758 -0.26913 -0.14217 -0.1736 0.22807 -0.11152 -0.047725 0.19557 0.13388 -0.21704 0.39025 -0.30286 0.16748 -0.18748 0.11423 -0.19393 -0.10635 -0.12826 -0.3244 0.27615 -0.25832 -0.17595 -0.12634 -0.094196 -0.19782 0.14435 -0.059313 -0.24001 -0.13996 -0.09501 -0.26155 -0.35677 0.059324 -0.23963 -0.20722 -0.37483 -0.11253 0.021369 -0.15571 0.059181 0.33843 -0.058266 -0.12393 0.17777 -0.032558 0.17864 0.28223 0.058037 0.13108 -0.31817 0.081199 -0.05605 -0.029366 0.30827 0.3208 0.070286 0.062643 0.0040956 -0.080481 0.0064075 -0.087952 0.19877 0.33604 0.28209 0.073563 -0.097628 0.035748 -0.20385 -0.28676 -0.12122 0.10025 -0.05521 0.22991 -0.326 0.062162 0.090364 -0.20831 0.3678 -0.00043566 0.059466 -0.068502 -0.072635 0.08424 0.56188 -0.2588 0.15091 -0.15923 -0.12595 0.086243 0.08293 -0.37854 0.055448 0.11274 0.19559 -0.17132 -0.0858 -0.072667 -0.10356 0.1394 
。 -0.097674 0.13847 -0.25706 0.057651 -0.29097 0.23529 -0.022163 -0.046278 0.18153 0.15302 -0.25117 0.30537 -0.22519 0.072574 -0.20933 -0.012315 -0.098127 -0.13748 -0.10589 -0.24647 0.38788 -0.077517 -0.24665 -0.1512 -0.17301 -0.13699 0.15931 -0.050389 -0.20344 -0.10393 -0.086151 -0.12502 -0.51355 -0.078194 -0.32112 -0.25169 -0.55924 -0.083918 0.13193 -0.12648 0.030666 0.37635 -0.0068401 -0.082757 0.35918 0.099755 0.048127 0.14651 -0.078756 0.16794 -0.35228 0.096068 -0.083268 -0.16416 0.30675 0.2112 -0.0034745 0.06171 0.015094 0.035436 0.13429 -0.036958 0.052708 0.39273 0.15883 0.015595 -0.014254 0.025274 -0.061765 -0.20447 -0.11626 0.12291 -0.13875 0.28874 -0.46607 -0.0064296 0.09502 -0.19274 0.32262 -0.077533 0.058291 -0.11019 -0.23094 -0.023817 0.59467 -0.31411 0.13071 -0.064146 -0.10452 -0.014019 0.28547 -0.3112 -0.019938 0.073268 0.2858 -0.087934 -0.0038124 -0.032765 -0.086257 0.07277 
的 -0.32055 0.26205 -0.12693 0.036473 -0.48332 0.37801 0.043741 0.063979 0.17719 -0.0034521 -0.28247 0.4286 -0.11431 0.02168 -0.2469 -0.34261 -0.29886 -0.11997 -0.068971 -0.084678 0.36461 -0.089133 -0.056445 -0.15533 0.00017123 0.1496 0.24858 0.12394 -0.23362 -0.26373 0.037876 -0.20656 -0.48941 0.093864 -0.21763 -0.35464 -0.31409 -0.10626 0.064067 -0.21431 0.028025 0.27952 -0.1368 -0.30315 0.39894 0.23021 0.19465 0.12281 0.22508 -0.14596 -0.31362 -0.035584 0.076387 -0.28307 0.32819 0.21772 0.2417 0.23587 -0.097756 -0.18368 -0.027078 -0.15416 0.095119 -0.16597 0.096744 0.20759 0.083306 0.24435 -0.055484 -0.17169 -0.031104 0.13582 0.15192 0.066508 -0.19847 -0.28637 0.027218 -0.030856 0.36561 -0.13589 0.26368 -0.13762 -0.21137 -0.24706 0.46078 -0.31472 0.080658 0.23818 -0.060492 0.18232 0.19158 -0.16032 0.14793 0.021469 0.22363 -0.20411 0.07628 -0.096523 -0.11407 -0.35992 

Converting to a format PyTorch can load

To make it easy to use during training, the vectors need to be converted:

import pickle as pkl
import numpy as np
import os
import json

def create_wordVec(vec_file,word2id,vec):
    word_map = {}
    word_map['PAD'] = len(word_map)
    word_map['UNK'] = len(word_map)
    word_embed = []
    for line in open(vec_file):
        content = line.strip().split()
        if len(content) != 100 + 1:
            continue
        word_map[content[0]] = len(word_map)
        word_embed.append(np.asarray(content[1:], dtype=np.float32))
    word_embed = np.array(word_embed)
    np.save(vec,word_embed)
    with open(word2id, 'w+', encoding='utf-8') as fw:
      fw.write(json.dumps(word_map, ensure_ascii=False))

create_wordVec('result.vec','word2id.json','word2vec.npy')

Training the model

Based on OpenNRE's CNN classification example code:

import torch
import numpy as np
import json
import opennre
from opennre import encoder, model, framework

print('load word2vec ...')
ckpt = 'ckpt/semeval_cnn_softmax.pth.tar'
wordi2d = json.load(open('pretrain/glove/word2id.json'))
word2vec = np.load('pretrain/glove/word2vec.npy')
rel2id = json.load(open('benchmark/ccks/ccks_rel2id.txt'))
print('create model ...')
sentence_encoder = opennre.encoder.CNNEncoder(token2id=wordi2d,
                                             max_length=100,
                                             word_size=100,
                                             position_size=5,
                                             hidden_size=230,
                                             blank_padding=True,
                                             kernel_size=3,
                                             padding_size=1,
                                             word2vec=word2vec,
                                             dropout=0.5)
model = opennre.model.SoftmaxNN(sentence_encoder, len(rel2id), rel2id)
framework = opennre.framework.SentenceRE(
    train_path='benchmark/ccks/ccks_train.txt',
    val_path='benchmark/ccks/ccks_dev.txt',
    test_path='benchmark/ccks/ccks_dev.txt',
    model=model,
    ckpt=ckpt,
    batch_size=32,
    max_epoch=100,
    lr=0.1,
    weight_decay=1e-5,
    opt='sgd')
# Train
#framework.train_model(metric='micro_f1')
# Test
print('load model ...')

framework.load_state_dict(torch.load(ckpt)['state_dict'])
framework.train_model(metric='micro_f1')
result = framework.eval_model(framework.test_loader)
print('Accuracy on test set: {}'.format(result['acc']))
print('Micro Precision: {}'.format(result['micro_p']))
print('Micro Recall: {}'.format(result['micro_r']))
print('Micro F1: {}'.format(result['micro_f1']))


Author: jqpeng
Original post: Configuration files of the NLP annotation tool brat

Setting up brat quickly

Via docker:

docker run --name=brat -d -p 38080:80 -e BRAT_USERNAME=brat -e BRAT_PASSWORD=brat -e BRAT_EMAIL=brat@example.com cassj/brat

Starting the container pulls the image, so be patient; then open IP:38080 and log in to brat with brat / brat.

The four kinds of brat configuration files

The configuration of an annotation project is controlled by four files:

  • annotation.conf: annotation type configuration
  • visual.conf: annotation display configuration
  • tools.conf: annotation tool configuration
  • kb_shortcuts.conf: keyboard shortcut tool configuration

annotation.conf

The annotation type configuration file:

# Entity types
[entities]
# One entity type per line
Protein
Simple_chemical
Complex
Organism

# Events
[events]

# Event name  argument-name:argument-type
Gene_expression Theme:Protein
Binding Theme+:Protein
Positive_regulation Theme:<EVENT>|Protein, Cause?:<EVENT>|Protein
Negative_regulation Theme:<EVENT>|Protein, Cause?:<EVENT>|Protein

# Relations
[relations]

# Relation name and its arguments; syntax ARG:TYPE (where ARG are, by convention, Arg1 and Arg2)
Part-of Arg1:Protein, Arg2:Complex
Member-of Arg1:Protein, Arg2:Complex

# TODO: Should these really be called "Equivalent" instead of "Equiv"?
Equiv Arg1:Protein, Arg2:Protein, <REL-TYPE>:symmetric-transitive
Equiv Arg1:Simple_chemical, Arg2:Simple_chemical, <REL-TYPE>:symmetric-transitive
Equiv Arg1:Organism, Arg2:Organism, <REL-TYPE>:symmetric-transitive

# Attribute definitions
[attributes]

# Name  arguments
Negation        Arg:<EVENT>
Confidence        Arg:<EVENT>, Value:Possible|Likely|Certain

Visual configuration (visual.conf)

The visual configuration contains two sections:

  • [labels]
  • [drawing]

The [labels] section defines how annotation types are displayed in the UI:

Simple_chemical | Simple chemical | Chemical
annotation type | full name       | short display label

Fields are separated by "|"; the first field is a type name defined in annotation.conf.

The [drawing] section defines display styles, for example the color of each annotation type.

[labels]


Simple_chemical | Simple chemical | Chemical
Protein | Protein
Complex | Complex
Organism | Organism

Gene_expression | Gene expression | Expression | Expr
Binding | Binding
Regulation | Regulation
Positive_regulation | Positive regulation | +Regulation
Negative_regulation | Negative regulation | -Regulation
Phosphorylation | Phosphorylation | Phos


Equiv | Equiv

Theme  | Theme
Cause  | Cause
Participant | Participant

[drawing]


SPAN_DEFAULT    fgColor:black, bgColor:lightgreen, borderColor:darken
ARC_DEFAULT    color:black, arrowHead:triangle-5
ATTRIBUTE_DEFAULT    glyph:*

Protein    bgColor:#7fa2ff
Simple_chemical    bgColor:#8fcfff
Complex    bgColor:#8f97ff
Organism    bgColor:#ffccaa

Positive_regulation    bgColor:#e0ff00
Regulation    bgColor:#ffff00
Negative_regulation    bgColor:#ffe000

Cause    color:#007700
Equiv    dashArray:3-3, arrowHead:none

Negation    box:crossed, glyph:<NONE>, dashArray:<NONE>
Confidence    dashArray:3-6|3-3|-, glyph:<NONE>

Tool configuration (tools.conf)

The annotation tool configuration file, tools.conf, is divided into the following sections:

  • [options]
  • [search]
  • [normalization]
  • [annotators]
  • [disambiguators]

These sections are all optional: an empty file is a valid tools.conf.

Option configuration ([options] section)

The [options] section configures how the server handles tokenization, sentence splitting, validation, annotation logging, and so on:

  • Tokens tokenizer:VALUE, where VALUE=
    • whitespace: split by whitespace characters in source text (only)
    • ptblike: emulate Penn Treebank tokenization
    • mecab: perform Japanese tokenization using MeCab
  • Sentences splitter:VALUE, where VALUE=
    • regex: regular expression-based sentence splitting
    • newline: split by newline characters in source text (only)
  • Validation validate:VALUE, where VALUE=
    • all: perform full validation
    • none: don’t perform any validation
  • Annotation-log logfile:VALUE, where VALUE=
    • <NONE>: no annotation logging
    • NAME: log into file NAME (e.g. “/home/brat/work/annotation.log”)

For example, the following [options] section gives the default brat configuration before v1.3:

[options]
Tokens tokenizer:whitespace
Sentences splitter:regex
Validation validate:none
Annotation-log logfile:

The following [options] section enables Japanese tokenization using MeCab, sentence splitting only by newlines, full validation, and annotation logging into the given file. (In setting Annotation-log logfile, remember to make sure the web server has appropriate write permissions to the file.)

[options]
Tokens tokenizer:mecab
Sentences splitter:newline
Validation validate:all
Annotation-log logfile:/home/brat/work/annotation.log

Normalization DB configuration ([normalization] section)

The [normalization] section defines the normalization resources that are available. For information on setting up normalization DBs, see the brat normalization documentation.

Each line in the [normalization] section has the following syntax:

    DBNAME     DB:DBPATH, <URL>:HOMEURL, <URLBASE>:ENTRYURL

Here, DB, <URL> and <URLBASE> are literal strings (they should appear as written here), while “DBNAME”, “DBPATH”, “HOMEURL” and “ENTRYURL” should be replaced with specific values appropriate for the database being configured:

  • DBNAME: sets the database name (e.g. “Wiki”, “GO”). The name can be otherwise freely selected, but should not contain characters other than alphanumeric (“a”-“z”, “A”-“Z”, “0”-“9”), hyphen (“-“) and underscore (“_“). This name will be used both in the brat UI and in the annotation file to identify the DB.
  • DBPATH (optional): provides the file system path to the normalization DB data on the server, relative to the brat server root. If DBPATH isn’t set, the system assumes the DB can be found in the default location under the given DBNAME.
  • HOMEURL: sets the URL for the home page of the normalization resource (e.g. “http://en.wikipedia.org/wiki/“). Used both to identify the resource more specifically than DBNAME and to provide a link in the annotation UI for accessing the resource.
  • URLBASE (optional): sets a URL template (e.g. “http://en.wikipedia.org/?curid=%s“) that can be filled in to generate a direct link in the annotation UI to an entry in the normalization resource. The value should contain the characters “%s” as a placeholder that will be replaced with the ID of the entry.

The following example shows two configured normalization DBs.

[normalization]
Wiki DB:dbs/wiki, <URL>:http://en.wikipedia.org, <URLBASE>:http://en.wikipedia.org/?curid=%s
UniProt <URL>:http://www.uniprot.org/, <URLBASE>:http://www.uniprot.org/uniprot/%s

The first line sets configuration for a database called “Wiki”, found as “dbs/wiki” in the brat server directory, and the second for a DB called “UniProt”, found in the default location for a DB with this name.

Search configuration ([search] section)

The [search] section configures online search services, so that after selecting a span of text you can click a search link to look it up.

Each line in the [search] section contains the name used in the user interface for the search service, and a single key:value pair. The key should have the special value “<URL>” and its value should be the URL of the search service, with the string to query for replaced by “%s”.

The following example shows a simple [search] section.

[search]
Google       <URL>:http://www.google.com/search?q=%s
Wikipedia    <URL>:http://en.wikipedia.org/wiki/%s

When selecting a span or editing an annotation, these search options will then be shown in the brat annotation dialog.
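
As a concrete illustration of the template mechanism (this is not brat code, just a sketch of how a “%s” template expands), the selected span is URL-encoded and substituted into the configured <URL> value:

from urllib.parse import quote

template = 'http://www.google.com/search?q=%s'   # the <URL> value configured above
selected_span = 'Simple chemical'                # hypothetical selected text
print(template % quote(selected_span))
# -> http://www.google.com/search?q=Simple%20chemical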

Annotation tool configuration ([annotators] section)

The [annotators] section defines automatic annotation services that can be invoked from brat.

Each line in the [annotators] section contains a unique name for the service and key:value pairs defining the way it is presented in the user interface and the URL of the web service for the tool. Values should be given for “tool”, “model” and “<URL>” (the first two are used for the user interface only).

The following example shows a simple [annotators] section.

[annotators]
SNER-CoNLL tool:Stanford_NER, model:CoNLL, <URL>:http://example.com:80/tagger/

Disambiguation tool configuration ([disambiguators] section)

The [disambiguators] section defines automatic semantic class (annotation type) disambiguation services that can be invoked from brat.

Each line in the [disambiguators] section contains a unique name for the service and key:value pairs defining the way it is presented in the user interface and the URL of the web service for the tool. Values should be given for “tool”, “model” and “<URL>” (the first two are used for the user interface only).

The following example shows a simple [disambiguators] section.

[disambiguators]
simsem-MUC tool:simsem, model:MUC, <URL>:http://example.com:80/simsem/%s

As for search, the string to query for is identified by “%s” in the URL.

Let's look at a complete demo tools.conf:

[options]

# Possible values for validate:
# - all: perform full validation
# - none: don't perform any validation
Validation    validate:all

# Possible values for tokenizer
# - ptblike: emulate Penn Treebank tokenization
# - mecab: perform Japanese tokenization using MeCab
# - whitespace: split by whitespace characters in source text (only) 
Tokens       tokenizer:whitespace

# Possible values for splitter:
# - regex  : regular expression-based sentence splitting
# - newline: split by newline characters in source text (only)
Sentences    splitter:newline

# Possible values for logfile:
# - <NONE> : no annotation logging
# - NAME : log into file NAME (e.g. "/home/brat/annotation.log")
Annotation-log logfile:<NONE>

[search]

# Search option configuration. Configured queries will be available in
# text span annotation dialogs. When selected on the UI, these open
# the given URL ("<URL>") with the string "%s" replaced with the
# selected text span.

Google       <URL>:http://www.google.com/search?q=%s
Wikipedia    <URL>:http://en.wikipedia.org/wiki/Special:Search?search=%s
UniProt      <URL>:http://www.uniprot.org/uniprot/?sort=score&query=%s
EntrezGene   <URL>:http://www.ncbi.nlm.nih.gov/gene?term=%s
GeneOntology <URL>:http://amigo.geneontology.org/cgi-bin/amigo/search.cgi?search_query=%s&action=new-search&search_constraint=term
ALC          <URL>:http://eow.alc.co.jp/%s

[annotators]

# Automatic annotation service configuration. The values of "tool" and
# "model" are required for the UI, and "<URL>" should be filled with
# the URL of the web service. See the brat documentation for more
# information.

# Examples:
# Random              tool:Random, model:Random, <URL>:http://localhost:47111/
# Stanford-CoNLL-MUC  tool:Stanford_NER, model:CoNLL+MUC, <URL>:http://127.0.0.1:47111/
# NERtagger-GENIA     tool:NERtagger, model:GENIA, <URL>:http://example.com:8080/tagger/

[disambiguators]

# Automatic semantic disambiguation service configuration. The values
# of "tool" and "model" are required for the UI, and "<URL>" should be
# filled with the URL of the web service. See the brat documentation
# for more information.

# Example:
# simsem-GENIA    tool:simsem, model:GENIA, <URL>:http://example.com:8080/tagger/%s

[normalization]

# Configuration for normalization against external resources. The
# resource name (first field of each line) should match that of a
# normalization DB on the brat server (see tools/norm_db_init.py),
# "<URL>" should be filled with the URL of the resource (preferably
# one providing a search interface), and "<URLBASE>" should be a
# string containing "%s" that, when replacing "%s" with an ID in
# the external resource, becomes a link to a page representing
# the entry corresponding to the ID in that resource.

# Example
#UniProt    <URL>:http://www.uniprot.org/, <URLBASE>:http://www.uniprot.org/uniprot/%s
#GO    <URL>:http://www.geneontology.org/, <URLBASE>:http://amigo.geneontology.org/cgi-bin/amigo/term_details?term=GO:%s
#FMA    <URL>:http://fme.biostr.washington.edu/FME/index.html, <URLBASE>:http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=FMA&termId=FMA:%s

Keyboard shortcuts (kb_shortcuts.conf)

After selecting an annotation, you can press a shortcut key to quickly switch to the corresponding type:

P Protein
S Simple_chemical
X Complex
O Organism

C Cause
T Theme
