diff --git "a/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/PyTorch\347\246\273\347\272\277\346\216\250\347\220\206-FAQ.md" "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/PyTorch\347\246\273\347\272\277\346\216\250\347\220\206-FAQ.md"
index 521f3738395ea2d29253b369473bf771d80930b8..5a2f61dfae701307c5a24df34a5a1b92b8a35049 100644
--- "a/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/PyTorch\347\246\273\347\272\277\346\216\250\347\220\206-FAQ.md"
+++ "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/PyTorch\347\246\273\347\272\277\346\216\250\347\220\206-FAQ.md"
@@ -5,21 +5,24 @@
   - [2.1 How to view `ONNX/om/pbtxt` models](#21-如何查看-onnxompbtxt-模型)
   - [2.2 `Exporting the operator {opname} to ONNX opset version {version} is not supported.`](#22-exporting-the-operator-opname-to-onnx-opset-version-version-is-not-supported)
   - [2.3 Workaround for resize not supporting 5D input](#23-resize不支持5d输入的解决方案)
+  - [2.4 Loop structures](#24-循环（LOOP）结构)
 - [3 OM offline inference failures](#3-om离线推理失败问题汇总)
   - [3.1 atc command or Ascend dynamic libraries not found](#31-找不到atc命令或找不到ascend动态库)
   - [3.2 Common inference-tool errors and solutions](#32-模型推理工具常见的错误解决方案)
   - [3.3 Differences between msame and benchmark in multi-batch inference](#33-msame和benchmark在多batch下推理区别)
   - [3.4 msame/benchmark reports a size mismatch between the actual input and the om input](#34-msamebenchmark推理提示实际输入和om输入size不一致)
   - [3.5 benchmark tool reports `ERROR Check free memory less 0.256 rate now wait!`](#35-benchmark工具报错error-check-free-memory-less-0256-rate-now-wait)
-- [4 Common accuracy-debugging issues](#4-精度调试常见问题)
-- [5 Common performance-optimization issues](#5-性能优化常见问题)
+- [4 Accuracy issues](#4-精度相关问题)
+- [5 Performance issues](#5-性能相关问题)
   - [5.1 How to use AIPP to improve performance](#51-如何使用aipp进行性能提升)
-- [5.2 GPU inference (trtexec) errors](#52-gpu推理trtexec报错)
+  - [5.2 GPU inference (trtexec) errors](#52-gpu推理trtexec报错)
   - [5.3 Improving transpose performance](#53-提升transpose的性能)
+  - [5.4 AICPU operator issues](#54-AICPU算子问题)
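 # 1 Introduction
   This document is for developers working on offline inference of Ascend PyTorch models. It explains how to bring a model's offline-inference accuracy and performance up to target in the CANN software environment on Ascend servers. It lists only the common problems encountered in offline inference and their solutions, and is updated continuously.
 **FAQ submission format**
 Present entries as text wherever possible so that they can be indexed and searched.
+
 - Title
   - Error symptom
   - Root-cause analysis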
@@ -70,11 +73,11 @@
 - Solution
 
   Use the [MagicONNX tool](https://gitee.com/Ronnie_zheng/MagicONNX/tree/master) to modify the graph. The graph-modification code for this problem is given below; for more features, see the [tutorial](https://gitee.com/Ronnie_zheng/MagicONNX/blob/master/docs/tutorials.md) and the [API reference](https://gitee.com/Ronnie_zheng/MagicONNX/blob/master/docs/operations.md).
+
   ```python
   import numpy as np
   from magiconnx import OnnxGraph
-
-
+  
   def modify(path):
       graph = OnnxGraph(path)
       resizes = graph.get_nodes("Resize")
@@ -86,7 +89,7 @@
           graph.add_initializer(f'shape_{node.name}', np.array(shapes[idx][0]))
           reshape1.inputs = [node.inputs[0], f'shape_{node.name}']
           reshape1.outputs = [f'Reshape_{node.name}']
-
+  
           graph[node.inputs[-1]].value = np.array(shapes[idx][1])
           out_name = node.outputs[0]
           node.set_input(0, f'Reshape_{node.name}')
@@ -108,12 +111,56 @@
           reshape3.outputs = [out_name]
     
       graph.save('modify.onnx')
-
+  
   if __name__ == "__main__":
       modify('src.onnx')
   ```
 
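+## 2.4 Loop structures
+
++ Symptom
+
+  The original model contains a Loop structure. When it is exported as a static model, the loop body's nodes are stacked once per iteration, so the ONNX model ends up with too many nodes and becomes too large to open.
+
++ Root-cause analysis
+
+  In dynamic-graph inference the model is executed dynamically, and a Loop usually just calls the same operations repeatedly, so no resources accumulate. Exporting to a static graph instead records every iteration as separate nodes. The sampling structure below is a typical example:
+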
+
+  ```python
+  def farthest_point_sample(xyz, npoint):
+      """
+      Input:
+          xyz: pointcloud data, [B, N, 3]
+          npoint: number of samples
+      Return:
+          centroids: sampled pointcloud index, [B, npoint]
+      """
+      device = xyz.device
+      B, N, C = xyz.shape
+      centroids = torch.zeros(B, npoint, dtype=torch.long).to(device)
+      distance = torch.ones(B, N).to(device) * 1e10
+      farthest = torch.randint(0, N, (B,), dtype=torch.long).to(device)
+      batch_indices = torch.arange(B, dtype=torch.long).to(device)
+      # Sampling loop: runs npoint times
+      for i in range(npoint):
+          centroids[:, i] = farthest
+          centroid = xyz[batch_indices, farthest, :].view(B, 1, 3)
+          dist = torch.sum((xyz - centroid) ** 2, -1)
+          mask = dist < distance
+          distance[mask] = dist[mask]
+          farthest = torch.max(distance, -1)[1]
+      return centroids
+  ```
+
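++ Solution
+
+  Two approaches are available:
+
+  + Rewrite the loop: simple loops can be replaced with basic operators. For example, element selection or concatenation done inside a loop can be replaced with index/concat operations.
+  + Split the model: when a loop cannot be replaced, move it out of the model into data preprocessing or postprocessing. See the case study: [PointNet++ loop-sampling structure case study](https://gitee.com/wangjiangben_hw/ascend-pytorch-crowdintelligence-doc/blob/master/Ascend-PyTorch%E7%A6%BB%E7%BA%BF%E6%8E%A8%E7%90%86%E6%8C%87%E5%AF%BC/%E4%B8%93%E9%A2%98%E6%A1%88%E4%BE%8B/%E5%8A%9F%E8%83%BD%E6%89%93%E9%80%9A/PoinetNet++%E5%BE%AA%E7%8E%AF%E9%87%87%E6%A0%B7%E7%BB%93%E6%9E%84%E8%A7%A3%E5%86%B3%E6%A1%88%E4%BE%8B.md)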
+
 # 3 OM offline inference failures
+
 ## 3.1 atc command or Ascend dynamic libraries not found
 - Symptom
   ```shell
@@ -222,9 +269,9 @@
   bash build_local.sh
   ```
 
-# 4 Common accuracy-debugging issues
+# 4 Accuracy issues
 
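-# 5 Common performance-optimization issues
+# 5 Performance issues
 ## 5.1 How to use AIPP to improve performance
 For the underlying principle, see [Enabling AIPP](https://support.huaweicloud.com/atctool-cann503alpha2infer/atlasatc_16_0016.html).
 - Usage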
@@ -245,7 +292,7 @@
   - Add atc parameters
     Add [--enable_small_channel=1](https://support.huaweicloud.com/atctool-cann503alpha2infer/atlasatc_16_0077.html) and [--insert_op_conf=path/to/insert_op.cfg](https://support.huaweicloud.com/atctool-cann503alpha2infer/atlasatc_16_0068.html#ZH-CN_TOPIC_0000001152734182) to the existing atc command.
 
-# 5.2 GPU inference (trtexec) errors
+## 5.2 GPU inference (trtexec) errors
 
 + Error symptom
 
@@ -276,8 +323,8 @@
      For operator-support problems, try the following fallbacks in order:
 
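      	1. Use a newer TensorRT (trt) version, such as trt8 (the default is trt7.xxx)
-      	2. Measure performance by running offline ONNX inference with onnxruntime
-      	3. If option 2 still does not work, measure performance with online inference
+     	2. Measure performance by running offline ONNX inference with onnxruntime
+     	3. If option 2 still does not work, measure performance with online inference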
 
   2. Memory limits
 
@@ -300,22 +347,22 @@
   ```python
   import torch
   from torch import nn
-
+  
   class SrcCode(nn.Module):
       def forward(self,x):
           y = x.expand(1024, 1024, 3).transpose(0, 1)
           return y
-
+  
   class Optimizer(nn.Module):
       def forward(self,x):
           t = x.reshape(1024,1,3).repeat((1,1024,1))
           return t
-
+  
   src = SrcCode()
   opt = Optimizer()
   src.eval()
   opt.eval()
-
+  
   input1 = torch.randn(1024,3)
   out = src(input1)
   out2 = opt(input1)
@@ -330,3 +377,68 @@
   Result: whole-network performance improved by 270%, meeting the performance target.
   ![transpose_opt](./images/FAQ002_4.png)
   ![opt_profiling](./images/FAQ002_5.png)
+
+## 5.4 AICPU operator issues
+
++ Symptom
+
+  + The operator is not supported, and model conversion fails outright:
+
+    <img src="./images/image-20211223152947663.png" alt="image-20211223152947663" style="zoom:150%;" />
+
+  + Performance degradation:
+
+    ![image-20211223154539348](./images/image-20211223154539348.png)
+
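++ Root-cause analysis
+
+  When an operator does not support a given data type, it is no longer executed on the AI Core and falls back to the AICPU instead. This costs some performance and can even make model conversion fail outright. A common example is INT64, which the AI Core does not support by default.
+
++ Solution
+
+  This can be worked around by modifying the graph. The code below is a general approach for converting INT64 data nodes to INT32; conversions for other data types follow the same principle: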
+
+  ```python
+  import numpy as np
+  from magiconnx import OnnxGraph
+  
+  def value_to_int32(node):
+      # Clamp to the int32 range before casting, so that out-of-range
+      # values saturate instead of wrapping around.
+      int32_info = np.iinfo(np.int32)
+      node_value = np.clip(node.value, int32_info.min, int32_info.max)
+      # np.clip returns a new array; the node is only updated on assignment.
+      node.value = node_value.astype(np.int32)
+      return node
+  
+  # Convert every int64 Constant node to int32
+  def convert_all_constants(graph):
+      constant_nodes = graph.get_nodes('Constant')
+      for node in constant_nodes:
+          if np.issubdtype(node.value.dtype, np.int64):
+              node = value_to_int32(node)
+  
+  # Convert every int64 Initializer to int32
+  def convert_all_initializers(graph):
+      initializer_nodes = graph.get_nodes('Initializer')
+      for node in initializer_nodes:
+          if np.issubdtype(node.value.dtype, np.int64):
+              node = value_to_int32(node)
+              
+  def insert_cast_node(graph, before_node, node_name, dtype=6):
+      cast_node = graph.add_node(
+          node_name,
+          'Cast',
+          {'to': dtype}  # ONNX TensorProto dtype enum; 6 is INT32
+      )
+      graph.insert_node(before_node, cast_node, mode='after')
+  
+  # Insert a Cast node after every operator of a given type (here: Shape) to convert its output dtype
+  def insert_cast_after_shape(graph):
+      shape_nodes = graph.get_nodes("Shape")
+      for node in shape_nodes:
+          node_name = node.name
+          insert_name = 'expand_after_{}'.format(node_name)
+          insert_cast_node(graph, node_name, insert_name)
+  ```
+
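The clamp-then-cast step used by `value_to_int32` above can be tried in isolation. The sketch below uses only NumPy; the helper name `to_int32_saturating` and the sample values are illustrative assumptions and are not part of MagicONNX. It shows why clamping comes first: a value such as INT64_MAX, which exporters commonly emit as an "until the end" slice bound, would otherwise wrap around when cast to int32.

```python
import numpy as np


def to_int32_saturating(values: np.ndarray) -> np.ndarray:
    """Clamp an integer array to the int32 range, then cast it to int32."""
    info = np.iinfo(np.int32)
    return np.clip(values, info.min, info.max).astype(np.int32)


# Hypothetical constant values: INT64_MAX, a negative index, a small index.
ends = np.array([9223372036854775807, -1, 5], dtype=np.int64)
converted = to_int32_saturating(ends)
print(converted.dtype, converted.tolist())
```

Values that already fit in int32 pass through unchanged, so the helper is safe to apply to every int64 constant in a graph.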
diff --git "a/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/images/image-20211223152947663.png" "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/images/image-20211223152947663.png"
new file mode 100644
index 0000000000000000000000000000000000000000..bcc0fcf81795404e1b9ee345e1c9f9262af62b92
Binary files /dev/null and "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/images/image-20211223152947663.png" differ
diff --git "a/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/images/image-20211223154539348.png" "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/images/image-20211223154539348.png"
new file mode 100644
index 0000000000000000000000000000000000000000..e767dcd71835abb8e5da54b290daac77e51cc8b4
Binary files /dev/null and "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/images/image-20211223154539348.png" differ
diff --git "a/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/readme.md" "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/readme.md"
index d3cdddb8e7e65d73f5195e98d23f849906c31ae6..8d3a4a49503ef9f8bbcdca12d7c15f8ca08aa1d6 100644
--- "a/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/readme.md"
+++ "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/readme.md"
@@ -13,14 +13,14 @@
 
 # Case study TODO list
 
-| Category              | Problem                                | Status |
-| --------------------- | -------------------------------------- | ------ |
-| Functional enablement | Dynamic shape profile issues           | DONE   |
-| Functional enablement | Custom operator issues                 | TODO   |
-| Functional enablement | Repeated network-module call issues    | TODO   |
-| Performance tuning    | AICPU optimization issues              | TODO   |
-| Related tools         | Accuracy-debugging tools and workflow  | TODO   |
-| Related tools         | Performance-tuning tools and workflow  | TODO   |
+| Category              | Problem                                              | Status |
+| --------------------- | ---------------------------------------------------- | ------ |
+| Functional enablement | Dynamic shape profile issues                         | DONE |
+| Functional enablement | Custom operator issues                               | DONE (moved to the [FAQ](https://gitee.com/wangjiangben_hw/ascend-pytorch-crowdintelligence-doc/blob/master/Ascend-PyTorch%E7%A6%BB%E7%BA%BF%E6%8E%A8%E7%90%86%E6%8C%87%E5%AF%BC/PyTorch%E7%A6%BB%E7%BA%BF%E6%8E%A8%E7%90%86-FAQ.md)) |
+| Functional enablement | Repeated network-module call issues (Loop structure) | DONE (moved to the [FAQ](https://gitee.com/wangjiangben_hw/ascend-pytorch-crowdintelligence-doc/blob/master/Ascend-PyTorch%E7%A6%BB%E7%BA%BF%E6%8E%A8%E7%90%86%E6%8C%87%E5%AF%BC/PyTorch%E7%A6%BB%E7%BA%BF%E6%8E%A8%E7%90%86-FAQ.md)) |
+| Performance tuning    | AICPU optimization issues                            | DONE (moved to the [FAQ](https://gitee.com/wangjiangben_hw/ascend-pytorch-crowdintelligence-doc/blob/master/Ascend-PyTorch%E7%A6%BB%E7%BA%BF%E6%8E%A8%E7%90%86%E6%8C%87%E5%AF%BC/PyTorch%E7%A6%BB%E7%BA%BF%E6%8E%A8%E7%90%86-FAQ.md)) |
+| Related tools         | Accuracy-debugging tools and workflow                | TODO |
+| Related tools         | Performance-tuning tools and workflow                | TODO |
 
 
 
diff --git "a/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/\345\212\237\350\203\275\346\211\223\351\200\232/PoinetNet++\345\276\252\347\216\257\351\207\207\346\240\267\347\273\223\346\236\204\350\247\243\345\206\263\346\241\210\344\276\213.md" "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/\345\212\237\350\203\275\346\211\223\351\200\232/PoinetNet++\345\276\252\347\216\257\351\207\207\346\240\267\347\273\223\346\236\204\350\247\243\345\206\263\346\241\210\344\276\213.md"
new file mode 100644
index 0000000000000000000000000000000000000000..2aa545dd5366575c9800bdb3ee2ddde55187c787
--- /dev/null
+++ "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/\345\212\237\350\203\275\346\211\223\351\200\232/PoinetNet++\345\276\252\347\216\257\351\207\207\346\240\267\347\273\223\346\236\204\350\247\243\345\206\263\346\241\210\344\276\213.md"
@@ -0,0 +1,212 @@
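+# PointNet++ loop-sampling structure case study
+
+## Model background and symptoms
+
+Converting the PointNet++ model directly runs into the following problems:
+
++ The ONNX model has so many nodes that it cannot be visualized at all
++ Duplicate domin appears in the om model
+
+Model overview:
+
+[PointNet++](https://proceedings.neurips.cc/paper/2017/file/d8bf84be3800d12f74d8b05e9b89836f-Paper.pdf)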
+
+![image-20211223192855113](./imgs/image-20211223192855113.png)
+
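+The problem lies mainly in the **set abstraction** part of the front encoder module. Its **sampling** step, which downsamples the point cloud, is implemented as a loop. When the model is exported to ONNX, that is, when the dynamic graph is converted to a static graph, the same sampling structure is stacked over and over (the sampling structure itself contains no learnable parameters).
+
+## Debugging approach
+
+Code analysis:
+
+Take pointnet_cls_ssg.py as an example. Its encoder consists of self.sa1, self.sa2 and self.sa3. Note that self.sa3 is created with group_all=True, which means it does no sampling, so the problem lies in the first two modules.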
+
+```python
+class get_model(nn.Module):
+    def __init__(self,num_class,normal_channel=True):
+        super(get_model, self).__init__()
+        in_channel = 6 if normal_channel else 3
+        self.normal_channel = normal_channel
+        self.sa1 = PointNetSetAbstraction(npoint=512, radius=0.2, nsample=32, in_channel=in_channel, mlp=[64, 64, 128], group_all=False)
+        self.sa2 = PointNetSetAbstraction(npoint=128, radius=0.4, nsample=64, in_channel=128 + 3, mlp=[128, 128, 256], group_all=False)
+        self.sa3 = PointNetSetAbstraction(npoint=None, radius=None, nsample=None, in_channel=256 + 3, mlp=[256, 512, 1024], group_all=True)
+        self.fc1 = nn.Linear(1024, 512)
+        self.bn1 = nn.BatchNorm1d(512)
+        self.drop1 = nn.Dropout(0.4)
+        self.fc2 = nn.Linear(512, 256)
+        self.bn2 = nn.BatchNorm1d(256)
+        self.drop2 = nn.Dropout(0.4)
+        self.fc3 = nn.Linear(256, num_class)
+
+    def forward(self, xyz):
+        B, _, _ = xyz.shape
+        if self.normal_channel:
+            norm = xyz[:, 3:, :]
+            xyz = xyz[:, :3, :]
+        else:
+            norm = None
+        l1_xyz, l1_points = self.sa1(xyz, norm)
+        l2_xyz, l2_points = self.sa2(l1_xyz, l1_points)
+        l3_xyz, l3_points = self.sa3(l2_xyz, l2_points)
+        x = l3_points.view(B, 1024)
+        x = self.drop1(F.relu(self.bn1(self.fc1(x))))
+        x = self.drop2(F.relu(self.bn2(self.fc2(x))))
+        x = self.fc3(x)
+        x = F.log_softmax(x, -1)
+        return x, l3_points
+```
+
+The structure of PointNetSetAbstraction is shown below; the sampling operation is `sample_and_group`:
+
+```python
+class PointNetSetAbstraction(nn.Module):
+    def __init__(self, npoint, radius, nsample, in_channel, mlp, group_all):
+        super(PointNetSetAbstraction, self).__init__()
+        self.npoint = npoint
+        self.radius = radius
+        self.nsample = nsample
+        self.mlp_convs = nn.ModuleList()
+        self.mlp_bns = nn.ModuleList()
+        last_channel = in_channel
+        for out_channel in mlp:
+            self.mlp_convs.append(nn.Conv2d(last_channel, out_channel, 1))
+            self.mlp_bns.append(nn.BatchNorm2d(out_channel))
+            last_channel = out_channel
+        self.group_all = group_all
+
+    def forward(self, xyz, points):
+        """
+        Input:
+            xyz: input points position data, [B, C, N]
+            points: input points data, [B, D, N]
+        Return:
+            new_xyz: sampled points position data, [B, C, S]
+            new_points_concat: sample points feature data, [B, D', S]
+        """
+        xyz = xyz.permute(0, 2, 1)
+        if points is not None:
+            points = points.permute(0, 2, 1)
+
+        if self.group_all:
+            new_xyz, new_points = sample_and_group_all(xyz, points)
+        else:
+            new_xyz, new_points = sample_and_group(self.npoint, self.radius, self.nsample, xyz, points)
+        # new_xyz: sampled points position data, [B, npoint, C]
+        # new_points: sampled points data, [B, npoint, nsample, C+D]
+        new_points = new_points.permute(0, 3, 2, 1) # [B, C+D, nsample,npoint]
+        for i, conv in enumerate(self.mlp_convs):
+            bn = self.mlp_bns[i]
+            new_points =  F.relu(bn(conv(new_points)), inplace=True)
+
+        new_points = torch.max(new_points, 2)[0]
+        new_xyz = new_xyz.permute(0, 2, 1)
+        return new_xyz, new_points
+```
+
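+The sampling operation is as follows:
+
+When sampling is enabled, the `farthest_point_sample` strategy is used, which loops `npoint` times: 512 times for self.sa1 and 128 times for self.sa2. This is what causes the sampling structure to be stacked repeatedly when the model is converted to a static graph.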
+
+```python
+def sample_and_group(npoint, radius, nsample, xyz, points, returnfps=False):
+    """
+    Input:
+        npoint:
+        radius:
+        nsample:
+        xyz: input points position data, [B, N, 3]
+        points: input points data, [B, N, D]
+    Return:
+        new_xyz: sampled points position data, [B, npoint, 3]
+        new_points: sampled points data, [B, npoint, nsample, 3+D]
+    """
+    B, N, C = xyz.shape
+    S = npoint
+    fps_idx = farthest_point_sample(xyz, npoint) # [B, npoint]
+    new_xyz = index_points(xyz, fps_idx)
+    idx = query_ball_point(radius, nsample, xyz, new_xyz)
+    grouped_xyz = index_points(xyz, idx) # [B, npoint, nsample, C]
+    grouped_xyz_norm = grouped_xyz - new_xyz.view(B, S, 1, C)
+
+    if points is not None:
+        grouped_points = index_points(points, idx)
+        new_points = torch.cat([grouped_xyz_norm, grouped_points], dim=-1) # [B, npoint, nsample, C+D]
+    else:
+        new_points = grouped_xyz_norm
+    if returnfps:
+        return new_xyz, new_points, grouped_xyz, fps_idx
+    else:
+        return new_xyz, new_points
+
+
+def sample_and_group_all(xyz, points):
+    """
+    Input:
+        xyz: input points position data, [B, N, 3]
+        points: input points data, [B, N, D]
+    Return:
+        new_xyz: sampled points position data, [B, 1, 3]
+        new_points: sampled points data, [B, 1, N, 3+D]
+    """
+    device = xyz.device
+    B, N, C = xyz.shape
+    new_xyz = torch.zeros(B, 1, C).to(device)
+    grouped_xyz = xyz.view(B, 1, N, C)
+    if points is not None:
+        new_points = torch.cat([grouped_xyz, points.view(B, 1, N, -1)], dim=-1)
+    else:
+        new_points = grouped_xyz
+    return new_xyz, new_points
+  
+def farthest_point_sample(xyz, npoint):
+    """
+    Input:
+        xyz: pointcloud data, [B, N, 3]
+        npoint: number of samples
+    Return:
+        centroids: sampled pointcloud index, [B, npoint]
+    """
+    device = xyz.device
+    B, N, C = xyz.shape
+    centroids = torch.zeros(B, npoint, dtype=torch.long).to(device)
+    distance = torch.ones(B, N).to(device) * 1e10
+    farthest = torch.randint(0, N, (B,), dtype=torch.long).to(device)
+    batch_indices = torch.arange(B, dtype=torch.long).to(device)
+    for i in range(npoint):
+        centroids[:, i] = farthest
+        centroid = xyz[batch_indices, farthest, :].view(B, 1, 3)
+        dist = torch.sum((xyz - centroid) ** 2, -1)
+        mask = dist < distance
+        distance[mask] = dist[mask]
+        farthest = torch.max(distance, -1)[1]
+    return centroids
+```
+
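+## Solution
+
+The currently recommended approach is to split the model at the self.sa1 boundary and run the sampling operation as data preprocessing, passing its result in as a model input: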
+
+![image-20211223193009811](./imgs/image-20211223193009811.png)
+
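+The sampling results are computed beforehand in preprocessing, so inference proceeds as follows:
+
++ In point-cloud preprocessing, run `farthest_point_sample`; the sampling result and the original point cloud are the inputs to the part1 model
+
++ Change the part1 model to take two inputs, then run inference on part1 (that is, self.sa1) to get its output
+
++ Run `farthest_point_sample` on the self.sa1 output; the sampling result and the self.sa1 output are the inputs to part2
+
++ Change the part2 model to take two inputs, then run inference on part2 to get the final output
+
+  On accuracy and performance:
+
+  Accuracy is compared in the usual way.
+
+  The final performance figure is the combined performance of the two models.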
+
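+## Summary
+
+For loop structures like this:
+
++ Simple loops can be avoided by finding an alternative implementation, provided the function is simple and all of its parameters are fixed
++ Complex loops usually require splitting the network and moving the loop outside the model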
\ No newline at end of file
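The first preprocessing step of the solution above (running farthest point sampling outside the model) can be sketched as follows. This is a minimal NumPy port of the `farthest_point_sample` function quoted in the case study, used here so the example stays self-contained. The function name `farthest_point_sample_np`, the `seed` parameter, and the random input cloud are assumptions for illustration. How the resulting indices are wired into the two-input part1 model depends on the modified model and is not shown.

```python
import numpy as np


def farthest_point_sample_np(xyz: np.ndarray, npoint: int, seed: int = 0) -> np.ndarray:
    """Farthest point sampling in preprocessing.

    xyz:    point cloud, shape [B, N, 3]
    return: sampled point indices, shape [B, npoint], dtype int64
    """
    rng = np.random.default_rng(seed)
    B, N, _ = xyz.shape
    centroids = np.zeros((B, npoint), dtype=np.int64)
    # Squared distance from every point to its nearest already-selected point.
    distance = np.full((B, N), 1e10)
    # Start from one random point per batch element.
    farthest = rng.integers(0, N, size=B)
    batch_indices = np.arange(B)
    for i in range(npoint):
        centroids[:, i] = farthest
        centroid = xyz[batch_indices, farthest, :].reshape(B, 1, 3)
        dist = np.sum((xyz - centroid) ** 2, axis=-1)
        # Same effect as `distance[mask] = dist[mask]` in the torch version.
        distance = np.minimum(distance, dist)
        farthest = np.argmax(distance, axis=-1)
    return centroids


# Hypothetical input: 2 clouds of 1024 points each; npoint=512 matches self.sa1.
points = np.random.default_rng(1).standard_normal((2, 1024, 3)).astype(np.float32)
fps_idx = farthest_point_sample_np(points, npoint=512)
print(fps_idx.shape, fps_idx.dtype)
```

Because the loop now runs in preprocessing, the exported model only sees `fps_idx` as an ordinary input tensor, so no sampling nodes are stacked in the static graph.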
diff --git "a/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/\345\212\237\350\203\275\346\211\223\351\200\232/imgs/image-20211223192855113.png" "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/\345\212\237\350\203\275\346\211\223\351\200\232/imgs/image-20211223192855113.png"
new file mode 100644
index 0000000000000000000000000000000000000000..19a415f4444afde5f647d3398303822669192895
Binary files /dev/null and "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/\345\212\237\350\203\275\346\211\223\351\200\232/imgs/image-20211223192855113.png" differ
diff --git "a/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/\345\212\237\350\203\275\346\211\223\351\200\232/imgs/image-20211223193009811.png" "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/\345\212\237\350\203\275\346\211\223\351\200\232/imgs/image-20211223193009811.png"
new file mode 100644
index 0000000000000000000000000000000000000000..5f2df70388a5bceead6bdc9bc22d089b5828258f
Binary files /dev/null and "b/Ascend-PyTorch\347\246\273\347\272\277\346\216\250\347\220\206\346\214\207\345\257\274/\344\270\223\351\242\230\346\241\210\344\276\213/\345\212\237\350\203\275\346\211\223\351\200\232/imgs/image-20211223193009811.png" differ