Closure of DFI effective 10th July 2024

Status
Not open for further replies.

SKC

Senior Member
Joined
Aug 16, 2014
Messages
9,469
Likes
32,241
Country flag
Guys, calm down a bit.

Take a rest for a day or two.

Let us think together about a possible solution.

I will download the latest version of the PHP forum CMS and tinker with it a bit.

Let's see how to take care of the hosting issues.
 

Extraordinary

Regular Member
Joined
Apr 5, 2024
Messages
183
Likes
328
Country flag
What about those who were refugees from BRF in the first place, lol?
Guess it's time to transcend to the higher plane of existence, Twitter I mean.
A forum has a different environment altogether.
Even signing up to this forum requires a certain IQ.
Facebook and Twitter can be used by anyone; this platform offers a different type of engagement.
 

Jor_Se_Bolo_PKMKB

Regular Member
Joined
May 30, 2019
Messages
402
Likes
2,790
Country flag
From ChatGPT... I will attempt to scrape data tomorrow

# Scraping Data from DefenceForumIndia

## Overview

The process of scraping a website, especially a large one like a forum, involves several steps and considerations. Below is a high-level overview of what you need to do to scrape data from DefenceForumIndia.

### Steps

1. **Check the Site's Terms of Service**: Ensure that scraping the site does not violate any terms of service or legal agreements.

2. **Identify the Structure**: Analyze the forum's structure to understand how the data is organized (e.g., threads, posts, user profiles).

3. **Choose a Scraping Tool**: Use tools like `BeautifulSoup` and `requests` in Python, or specialized scraping frameworks like Scrapy.

4. **Write the Scraping Script**:
- Start by writing a script to crawl and download the HTML content of each page.
- Parse the HTML to extract the relevant data (e.g., post content, authors, timestamps).
- Store the extracted data in a structured format (e.g., CSV, JSON, database).

5. **Handle Pagination**: Forums usually have multiple pages for threads and posts. Make sure your script can navigate through these pages.

6. **Be Mindful of Rate Limits**: To avoid getting banned or overloading the server, implement delays between requests and respect any rate limiting specified by the website.

7. **Data Storage**: Decide where to store the scraped data. Options include local files, databases, or cloud storage.

8. **Backup and Redundancy**: Regularly back up the data to prevent loss in case of interruptions during scraping.

### Example Scraping Script

Below is a simplified example using Python with `requests` and `BeautifulSoup`:

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin
import time
import csv

base_url = 'http://defenceforumindia.com/forums/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
}

def scrape_forum(forum_url, output_file):
    session = requests.Session()
    session.headers.update(headers)

    with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
        fieldnames = ['thread_title', 'post_content', 'author', 'timestamp']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()

        page = 1
        while True:
            url = f"{forum_url}page-{page}"
            response = session.get(url)
            if response.status_code != 200:
                break

            soup = BeautifulSoup(response.content, 'html.parser')
            # The CSS class names below are guesses at the XenForo markup and may need adjusting.
            threads = soup.find_all('div', class_='structItem--thread')
            if not threads:
                break

            for thread in threads:
                title_div = thread.find('div', class_='structItem-title')
                thread_title = title_div.text.strip()
                thread_link = title_div.find('a')['href']
                thread_url = urljoin(base_url, thread_link)  # thread links are usually relative

                # Scrape the individual thread
                thread_response = session.get(thread_url)
                thread_soup = BeautifulSoup(thread_response.content, 'html.parser')
                posts = thread_soup.find_all('article', class_='message')
                for post in posts:
                    post_content = post.find('div', class_='bbWrapper').text.strip()
                    author = post.find('h4', class_='message-name').text.strip()
                    timestamp = post.find('time')['datetime']

                    writer.writerow({
                        'thread_title': thread_title,
                        'post_content': post_content,
                        'author': author,
                        'timestamp': timestamp
                    })

            page += 1
            time.sleep(1)  # Be respectful with your requests

scrape_forum('http://defenceforumindia.com/forums/some-forum/', 'forum_data.csv')
```
 

Extraordinary

Regular Member
Joined
Apr 5, 2024
Messages
183
Likes
328
Country flag
From ChatGPT... I will attempt to scrape data tomorrow
[quoted scraping guide and script from the post above]
At best you will scrape some text. Photos won't be scraped that easily.
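If someone does want the photos too, a minimal sketch along the lines of the script above could pull image URLs out of each post body and save them locally (the `bbWrapper` class, the lazy-load `data-src` attribute, and the thread URL at the bottom are assumptions, not confirmed DFI markup):

```python
import os
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

headers = {'User-Agent': 'Mozilla/5.0'}

def download_post_images(thread_url, out_dir='dfi_images'):
    """Fetch one thread page and save every image referenced in its posts."""
    os.makedirs(out_dir, exist_ok=True)
    session = requests.Session()
    session.headers.update(headers)

    response = session.get(thread_url)
    response.raise_for_status()
    soup = BeautifulSoup(response.content, 'html.parser')

    # 'bbWrapper' is the assumed XenForo class for post bodies, as in the script above.
    for post in soup.find_all('div', class_='bbWrapper'):
        for img in post.find_all('img'):
            src = img.get('src') or img.get('data-src')  # lazy-loaded images often use data-src
            if not src:
                continue
            img_url = urljoin(thread_url, src)
            filename = os.path.basename(urlparse(img_url).path) or 'image'
            try:
                img_data = session.get(img_url, timeout=30)
                img_data.raise_for_status()
            except requests.RequestException:
                continue  # skip dead links and external hotlinks
            with open(os.path.join(out_dir, filename), 'wb') as f:
                f.write(img_data.content)

# Hypothetical thread URL, for illustration only
download_post_images('http://defenceforumindia.com/threads/some-thread.12345/')
```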
 

hurrians

Senior Member
Joined
Nov 24, 2023
Messages
1,109
Likes
1,922
Country flag
HTTrack Website Copier; I don't know whether it works, of course with admin approval.

Someone here may directly or indirectly know the owner; hopefully they are in touch (in person) and will give clarity going forward.
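For what it's worth, a minimal HTTrack run could be kicked off like this (a sketch only, assuming HTTrack is installed; the filter pattern and output folder are placeholders, and admin approval still applies). Wrapped in Python to match the other snippets in the thread:

```python
# Mirror the site with HTTrack (installed separately), invoked via subprocess.
import subprocess

subprocess.run(
    [
        "httrack", "https://defenceforumindia.com/",
        "-O", "./dfi-mirror",             # output directory for the mirror
        "+*.defenceforumindia.com/*",     # only follow links on the forum's own domain
        "-v",                             # verbose progress output
    ],
    check=True,
)
```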
 

hurrians

Senior Member
Joined
Nov 24, 2023
Messages
1,109
Likes
1,922
Country flag
- Same site with ownership transferred.
- Build another site; all XenForo forums have a similar UI.
- Will depend on what the owners tell/allow.
 

Haldilal

लड़ते लड़ते जीना है, लड़ते लड़ते मरना है (We must live fighting, we must die fighting)
Senior Member
Joined
Aug 10, 2020
Messages
30,040
Likes
115,405
Country flag
Y'all Nibbiars, the reason is simple: no money issues or legal troubles. The owner simply got over the site and decided to close it down.

 

Extraordinary

Regular Member
Joined
Apr 5, 2024
Messages
183
Likes
328
Country flag
Y'all Nibbiars, the reason is simple: no money issues or legal troubles. The owner simply got over the site and decided to close it down.
He can transfer the ownership and we will take care of the cost.
Resolving legal troubles might be difficult.
 

hurrians

Senior Member
Joined
Nov 24, 2023
Messages
1,109
Likes
1,922
Country flag
- Most of them started in their teen years.
- Now they are grown-ups with careers.
- So it was visible when the admins were able to focus less here.
 

mokoman

Senior Member
Joined
May 31, 2020
Messages
6,484
Likes
34,871
Country flag
There are about 3 million posts on this website; chalk it up to 20 KB per post and it's not more than 60 GB worth of text data.
What really takes up space are the image files; those probably run to several terabytes.
5 to 7 terabytes of images, easily.
Nice, looks like it: 2,742,777 posts.

It's there in the post link.
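A quick back-of-envelope check of those numbers (a rough sketch; the per-post and per-image sizes are assumptions, only the post count comes from the forum stats):

```python
# Back-of-envelope size of the text archive (per-post size is an assumption)
posts = 2_742_777              # post count quoted in the thread
avg_post_kb = 20               # assumed average text + metadata per post
text_gb = posts * avg_post_kb / 1024**2
print(f"~{text_gb:.0f} GB of text")   # ~52 GB, close to the ~60 GB guess

# How many images would it take to reach the 5-7 TB estimate?
avg_image_mb = 2               # hypothetical average image size
for target_tb in (5, 7):
    images_needed = target_tb * 1024**2 / avg_image_mb
    print(f"{target_tb} TB of images ≈ {images_needed:,.0f} images at {avg_image_mb} MB each")
```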
 

angryIndian

Senior Member
Joined
Feb 26, 2013
Messages
1,047
Likes
4,268
Country flag
Oh no!
I grew up with MP.net, Bharat-Rakshak and DFI. It is disheartening to learn that one of the most respected forums in this space is shutting down.
 

shashankk

Regular Member
Joined
Jan 28, 2018
Messages
835
Likes
4,135
Country flag
Oh no!
I grew up with MP.net, Bharat-Rakshak and DFI. It is disheartening to learn that one of the most respected forums in this space is shutting down.
Brother, you are still alive in the forum world. :) Glad to see you.
 

Jimih

Senior Member
Joined
May 20, 2021
Messages
22,992
Likes
134,622
Country flag
What about those who were refugees from BRF in the first place, lol?
Guess it's time to transcend to the higher plane of existence, Twitter I mean.
90% of current DFI users would be banned within 2 days on BRF.

They don't allow shitposting and flamebaiting.

Registering with a proper email is also mandatory there.
 