Commit graph

  • d6f4de006e Merge branch 'release/0.1.1' into 'master' master Pablo Martin 2021-01-09 16:52:30 +00:00
  • a0d11dcdd6 Merge branch 'fix/capturer_broken_imports' into 'integration' Pablo Martin 2021-01-09 16:49:28 +00:00
  • c7ddbb035f Added sys.path trickery to make imports work again when executing out of PyCharm. pablo 2021-01-09 17:48:24 +01:00
  • 744a0a38d4 Merge branch 'release/0.1.0' into 'master' Pablo Martin 2021-01-06 09:45:45 +00:00
  • 575dadaaff More logging. pablo 2021-01-06 10:45:03 +01:00
  • 5e023edb00 Added a few logging points. pablo 2021-01-06 10:43:16 +01:00
  • f10b62bfd2 Reversed condition. pablo 2021-01-06 10:40:07 +01:00
  • 50a56091b9 Added missing logger import. pablo 2021-01-06 10:38:30 +01:00
  • 3740ab2ada Merge branch 'fix/dead_ad_string' into 'integration' Pablo Martin 2021-01-04 21:29:52 +00:00
  • 639de7c602 Change strings to look for in HTML. Chores. pablo 2021-01-04 22:29:01 +01:00
  • 6122f74e99 Merge branch 'refactor/capturer_improved' into 'integration' Pablo Martin 2021-01-04 21:23:16 +00:00
  • cbf1643fb5 Formatting, docstrings and other chores. pablo 2021-01-04 22:17:40 +01:00
  • adf2cd26ba Minor fix regarding issue spotting in parsing. pablo 2021-01-04 21:56:24 +01:00
  • cf4ce06b57 Implemented tests for CapturingTask. A few mock classes were needed. pablo 2021-01-03 20:06:28 +01:00
  • 007f458cd5 Minor fixes. pablo 2021-01-03 20:05:34 +01:00
  • e34a34acaf Fix in throttling test so it doesn't fail around midnight. pablo 2021-01-02 23:49:10 +01:00
  • def858ef6a Modified input format of instructions for ParsingFlowGenerator. Previous dict wouldn't allow for more than one SecondaryFeaturesFieldInstructions class pointer. pablo 2020-12-31 19:02:09 +01:00
  • 2b249063e0 Created a new flow generator + tests for it. pablo 2020-12-31 18:28:48 +01:00
  • b8d4893026 Mini syntax fix. pablo 2020-12-31 18:14:44 +01:00
  • cb553b5f7e Minor fixes in parsing utils. pablo 2020-12-29 20:42:21 +01:00
  • 3b79ba06d8 Created parsing_utils module to refactor HTML parsing and validation actions. pablo 2020-12-29 17:38:17 +01:00
  • 3f9a6d8e53 Integrated throttling in capturer. pablo 2020-12-27 12:35:02 +01:00
  • d136144a4e Throttling checks are now lazy. pablo 2020-12-26 20:25:56 +01:00
  • 2a9483981e Implemented a new throttling module to remove redundancy in the project. pablo 2020-12-26 18:54:04 +01:00
  • f207dd5dda Started integration branch. pablo 2020-12-26 12:12:44 +01:00
  • 0086cf2b4c Improved logging in refresher.py pablo 2020-11-15 13:21:08 +01:00
  • e939d67467 Improvements in listing page URL generation. pablo 2020-11-15 12:54:17 +01:00
  • a61fac72f7 Typing, docstrings and formatting of explorer.py pablo 2020-11-03 21:55:09 +01:00
  • f53a65834b Turned method static. pablo 2020-11-03 14:00:51 +01:00
  • 43236c2884 Typing, docstrings, formatting for capturer.py pablo 2020-11-03 13:50:36 +01:00
  • 3cf7dd8bd9 Typing, docstrings, formatting for mysql_wrapper.py pablo 2020-11-03 08:44:37 +01:00
  • e9ee23f852 Typing, docstrings, formatting for scrapping_utils.py pablo 2020-11-03 07:43:21 +01:00
  • a79fc533ee Formatting. pablo 2020-11-03 07:29:17 +01:00
  • cd9c3b6e39 Some changes. pablo 2020-11-03 07:26:06 +01:00
  • 9e7194c8d9 URLAttacks now share a common session. pablo 2020-11-02 13:08:37 +01:00
  • db04a67c4c More testing code. pablo 2020-11-02 12:51:20 +01:00
  • c337a33feb More testing code. pablo 2020-11-02 12:43:49 +01:00
  • 81112a4cb9 More testing code. pablo 2020-11-02 12:02:56 +01:00
  • 51c4bdb347 Fixes. Code version for mysql. pablo 2020-05-08 09:26:29 +02:00
  • 596aaa1393 . pablo 2020-05-05 11:36:28 +02:00
  • 8d4c082a18 Format. pablo 2020-04-26 15:06:04 +02:00
  • 923649a099 Format. Random headers pablo 2020-04-26 15:05:40 +02:00
  • af11a2e87f Weird waiting distribution implemented pablo 2020-04-26 14:54:27 +02:00
  • df032328e9 Formatting and todos. pablo 2020-04-25 18:26:22 +02:00
  • f0fe2b9780 Updated headers. pablo 2020-04-25 18:17:43 +02:00
  • c8ea77e99a Logging in explorer.py pablo 2020-03-26 11:47:12 +01:00
  • acfeeef0d1 Formatting pablo 2020-03-26 11:38:08 +01:00
  • 9c2565f5d8 Logging and formatting pablo 2020-03-26 11:37:32 +01:00
  • cdbb6b5325 Added logging to geocoder.py pablo 2020-03-26 11:30:09 +01:00
  • a9242b2f3a Added logging config. pablo 2020-03-26 11:18:14 +01:00
  • 812bb66219 Refactored geocoder to avoid an error with responses that have no results. pablomartincalvo 2019-02-09 17:25:27 +01:00
  • 5ec97ad008 Small refactorings. pablomartincalvo 2019-01-07 18:09:52 +01:00
  • 227f298d8b Removed unnecessary prints from the capturer. pablomartincalvo 2019-01-03 19:32:40 +01:00
  • 38984822a7 Updated headers of the URL attacks due to changes in idealsita. pablomartincalvo 2018-12-30 19:28:05 +01:00
  • ed32b15bc1 Changes in parser validation pablomartincalvo 2018-12-30 12:06:23 +01:00
  • 98165ce8f0 Changes in parser validation pablomartincalvo 2018-12-29 11:37:43 +01:00
  • 368f8a00bb Merge branch 'dev' pablomartincalvo 2018-12-25 18:54:06 +01:00
  • 9e251783dc Fixed phone validation pablomartincalvo 2018-12-25 18:53:20 +01:00
  • c234679a10 Tested the index batch in dev. pablomartincalvo 2018-12-23 18:30:11 +01:00
  • d71b69a611 New modules for analysis pablomartincalvo 2018-12-21 19:17:39 +01:00
  • 965f55755a Merge branch 'dev' pablomartincalvo 2018-12-18 20:05:27 +01:00
  • e304069684 Changed refresher so it no longer needs to check whether there are old ads. pablomartincalvo 2018-12-18 20:05:08 +01:00
  • 5b245c0aed Merge branch 'dev' pablomartincalvo 2018-12-04 21:02:56 +01:00
  • 5aba6309f0 Moved the spacing between attempts to Python memory instead of a database check. Adjusted some timings. pablomartincalvo 2018-12-04 21:02:30 +01:00
  • eeb8672f0d Skeleton of the changes needed to add visit information to the system. pablomartincalvo 2018-12-02 18:53:28 +01:00
  • e48975342f Merge branch 'dev' pablomartincalvo 2018-12-02 12:29:09 +01:00
  • 1168cc8ad8 Fixed a bug in refresher that put the same old ad in the queue multiple times. pablomartincalvo 2018-12-02 12:28:27 +01:00
  • 6685edc15b Merge branch 'dev' pablomartincalvo 2018-12-02 12:27:14 +01:00
  • b85d7f0388 Fixed a bug in refresher that put the same old ad in the queue multiple times. pablomartincalvo 2018-12-02 12:25:18 +01:00
  • 5e80f3e35c Merge branch 'dev' pablomartincalvo 2018-12-01 16:36:10 +01:00
  • b4b5180bd8 lol pablomartincalvo 2018-12-01 16:35:54 +01:00
  • 29f7401c71 Made the services' waiting times configurable. pablomartincalvo 2018-12-01 16:26:25 +01:00
  • 99d5d36bf4 lol pablomartincalvo 2018-12-01 12:01:11 +01:00
  • ec54a67bd0 Remove .idea from repo pablomartincalvo 2018-12-01 11:55:14 +01:00
  • 448a5dd261 Fixed an error in capturer when a 404 is returned. pablomartincalvo 2018-11-27 20:37:21 +01:00
  • 02dfa06b36 Added requirements. pablomartincalvo 2018-11-17 12:58:16 +01:00
  • 755aaa79cd Merge remote-tracking branch 'origin/master' pablomartincalvo 2018-11-16 19:49:45 +01:00
  • 3bbe7475e2 Fixes for detection of delisted ads. pablomartincalvo 2018-11-16 18:45:42 +01:00
  • df07497125 Fixes for detection of delisted ads. Improvements to the deployment script. pablomartincalvo 2018-11-16 18:20:50 +01:00
  • ee8df1b635 Fixed old-ad queries. pablomartincalvo 2018-11-05 23:40:54 +01:00
  • 71456d3c92 Adapted capturer and database to support m2 data with decimals. pablomartincalvo 2018-11-05 20:49:54 +01:00
  • dd3362aa3c Typo in capturer pablomartincalvo 2018-11-05 20:09:04 +01:00
  • 4ec8e3210c More minor fixes for testing. pablomartincalvo 2018-11-04 20:16:37 +01:00
  • 403bb2c0cc More minor fixes for testing. pablomartincalvo 2018-11-04 19:52:47 +01:00
  • 94b604997c More minor fixes for testing. pablomartincalvo 2018-11-02 19:21:52 +01:00
  • 906d8b5cd9 More changes in deployer. pablomartincalvo 2018-11-01 19:50:38 +01:00
  • 9a7ba03cd9 Progress on the deployment and configuration system. pablomartincalvo 2018-10-29 21:57:20 +01:00
  • a2dcec95f4 Fixed the criterion for identifying dead ads in the capturer. pablomartincalvo 2018-10-26 20:34:43 +02:00
  • 25e52a9e25 Creating deployment scripts for the system. pablomartincalvo 2018-10-26 20:28:07 +02:00
  • a3a2165f43 Testing error in geocoder. pablomartincalvo 2018-10-23 23:18:12 +02:00
  • 06e1f78f40 Testing error in geocoder. pablomartincalvo 2018-10-23 20:49:37 +02:00
  • 29072f0926 Testing error in geocoder. pablomartincalvo 2018-10-22 00:15:44 +02:00
  • 0248e75606 Testing error in geocoder. pablomartincalvo 2018-10-22 00:01:36 +02:00
  • bebfe12d74 Testing error in geocoder. pablomartincalvo 2018-10-21 17:42:14 +02:00
  • 98b9c48a6a Testing error in geocoder. pablomartincalvo 2018-10-21 14:01:15 +02:00
  • 600ff889be Minor tweaks in geocoder and capturer due to type issues. pablomartincalvo 2018-10-20 15:58:37 +02:00
  • c40d39e558 Merge branch 'dev' into testing pablomartincalvo 2018-10-19 19:18:14 +02:00
  • 3a4a4fc195 Finished and locally tested the Geocoding module, ready to try in an integrated environment. pablomartincalvo 2018-10-19 19:17:48 +02:00
  • 5d261328b8 Fixed type error in dead_ad_checker. pablomartincalvo 2018-10-19 18:05:35 +02:00
  • c3c16e7015 Minor fixes in capturer and refresher. pablomartincalvo 2018-10-19 17:22:09 +02:00