Ship trajectory prediction is crucial for collision avoidance and maritime traffic management. In an encounter scenario, a ship must predict the future states of other ships in order to take effective collision avoidance actions. However, conventional models often fail to capture the interactions between ships in converging waters, which degrades prediction accuracy. This paper proposes a new model, STETC, which combines a Transformer and a convolutional neural network (CNN) in a two-layer encoder–decoder structure. The motion path encoder employs a self-attention mechanism to extract motion features from historical trajectories. The encounter interaction encoder combines a CNN with self-attention to extract interaction features from an artificial potential field generated from the ships' dynamic parameters. The two decoders then use cross-attention to progressively establish the spatiotemporal relationships between the motion and interaction features. Through training, the model learns the interaction patterns between ships and the dynamic development of encounter situations. STETC is validated on real ship trajectory data. Comparison experiments with observation times of 5, 10, and 15 min show that STETC outperforms the other models considered. A case study shows that STETC accurately perceives the motion states of surrounding ships and, by accounting for the encounter scenario, generates predicted trajectories that more accurately reflect the ships' interactive behavior.
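
As an illustration of the cross-attention step described above, the following is a minimal NumPy sketch of single-head scaled dot-product cross-attention, in which motion features act as queries and interaction features act as keys and values. It is not the STETC implementation: the function names, the feature dimension, the sequence lengths, and the random projection matrices are all assumptions chosen for illustration, and a trained model would learn the projections and typically use multiple heads.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(motion, interaction, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention (illustrative only).

    motion:      (T, d) motion features, used as queries (assumed shape)
    interaction: (S, d) interaction features, used as keys and values
    w_q, w_k, w_v: (d, d) projection matrices (random here, learned in practice)
    """
    q = motion @ w_q
    k = interaction @ w_k
    v = interaction @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (T, S) similarity scores
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights               # fused features: (T, d)

# Hypothetical sizes: 10 trajectory steps, 6 interaction tokens, 16 features.
rng = np.random.default_rng(0)
T, S, d = 10, 6, 16
motion = rng.normal(size=(T, d))
interaction = rng.normal(size=(S, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

fused, weights = cross_attention(motion, interaction, w_q, w_k, w_v)
```

Each row of `weights` indicates how strongly one trajectory step attends to each interaction token, so the fused output mixes interaction information into the motion representation at every time step.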



