Ratnesh Kumar, Edwin Weill, Farzin Aghdasi and Parthasarathy Sriram
In this paper we tackle the problem of vehicle re-identification in a camera network utilizing triplet embeddings. Re-identification is the problem of matching appearances of objects across different cameras. With the proliferation of surveillance cameras enabling smart and safer cities, there is an ever-increasing need to re-identify vehicles across cameras. Typical challenges arising in smart city scenarios include variations of viewpoints, illumination and self occlusions. Most successful approaches for re-identification involve (deep) learning an embedding space such that the vehicles of same identities are projected closer to one another, compared to the vehicles representing different identities. Popular loss functions for learning an embedding (space) include contrastive or triplet loss. In this paper we provide an extensive evaluation of triplet loss applied to vehicle re-identification and demonstrate that using the recently proposed sampling approaches for mining informative data points outperform most of the existing state-of-the-art approaches for vehicle re-identification. Compared to most existing state-of-the-art approaches, our approach is simpler and more straightforward for training utilizing only identity-level annotations, along with one of the smallest published embedding dimensions for efficient inference. Furthermore in this work we introduce a formal evaluation of a triplet sampling variant (batch sample) into the re-identification literature. In addition to the conference version , this submission adds extensive experiments on new released datasets, cross domain evaluations and ablation studies.