School of Systems Science and Engineering;
Sun Yat-sen University;
Guangzhou 100876;
China;
Institute of Systems Engineering;
AMS;
PLA;
Beijing 100141;
China;
multi-beam satellite communications; time-frequency resource allocation; multi-objective optimization; deep reinforcement learning;